Chimeras of Hepatitis C Virus and Bovine Viral Diarrhea Virus

Reference to Government Grant 

This invention was made with government support under a grant from the National 
Institutes of Health, grant numbers PHS CA57973 and AI40034. The government has certain 
rights in this invention. 

Related Applications 

This application claims priority to, and incorporates herein in its entirety, U.S. 
60/082,964 filed April 24, 1998. 

Background of the Invention

(1) Field of the Invention

This invention relates generally to the development of therapies for treating hepatitis
C virus (HCV) and bovine viral diarrhea virus (BVDV) infections, and more particularly to the
identification of such therapies using chimeric viruses comprising genomic sequences
derived from HCV and BVDV.

(2) Description of the Related Art 

The Flaviviridae is an important family of human and animal RNA viral pathogens
(Rice, C.M. 1996. Flaviviridae: The viruses and their replication. In: Fields BN, Knipe DM,
Howley PM, eds. Fields virology. Philadelphia: Lippincott-Raven Publishers, pp. 931-960).
The three currently recognized genera of the Flaviviridae family exhibit distinct differences in
transmission, host range, and pathogenesis. For example, members of the classical flavivirus
genus, such as yellow fever virus and dengue virus, are typically transmitted to vertebrate
hosts via arthropod vectors and cause acute self-limiting disease (Monath TP, Heinz FX.
1996. Flaviviruses. In: Fields BN, Knipe DM, Howley PM, eds. Fields virology. New York:
Raven Press, pp. 961-1034). The pestiviruses, such as bovine viral diarrhea virus (BVDV)
and classical swine fever virus (CSFV), cause economically important livestock disease and
are spread by direct contact or the fecal-oral route (Thiel et al., 1996. Pestiviruses. In: Fields
BN, Knipe DM, Howley PM, eds. Fields virology. New York: Raven Press, pp. 1059-1073).
The most recently characterized Flaviviridae genus is the hepacivirus genus, the sole member
of which is the common and exclusively human pathogen, hepatitis C virus (HCV). HCV is
transmitted by contaminated blood or blood products and is the most common agent of
non-A, non-B hepatitis, affecting more than 1% of the population worldwide (Houghton, 1996.
Hepatitis C viruses. In: Fields BN, Knipe DM, Howley PM, eds. Fields virology.
Philadelphia: Lippincott-Raven Publishers, pp. 1035-1058). Unlike flavivirus and pestivirus
infections, which are usually eliminated by the host immune response, chronic HCV infections
are common and can cause mild to severe liver disease, including cancer.

Despite these differences, members of the Flaviviridae family share common
structural features and gene expression strategies. Virus particles consist of a lipid bilayer
envelope with embedded transmembrane glycoproteins surrounding a protein-RNA
nucleocapsid. Genome RNAs are single-stranded and of positive polarity, and function as the sole
mRNA species for translation of a single long open reading frame (ORF). This ORF is
translated into a polyprotein which is processed by cellular and viral proteases into mature
viral proteins. Structural proteins destined for incorporation into virus particles are encoded
in the N-terminal portion of the polyprotein, while the nonstructural proteins which form
components of the viral RNA replicase are encoded in the remainder.

Replication of the Flaviviridae RNA genome occurs via synthesis of a full-length
negative-strand intermediate and is asymmetric, favoring synthesis of positive-strand RNAs.
However, little is known about the details of this process. For all three genera of the
Flaviviridae family, full-length functional cDNA clones have been constructed, and RNAs
transcribed from these cDNA templates are infectious. For flaviviruses and pestiviruses,
mutagenesis of these clones and efficient RNA transfection of permissive cell cultures
provide a means of probing the role of cis RNA elements and viral proteins in replicase
assembly and function. Such analyses are not yet possible for HCV since this virus is unable
to replicate efficiently in cell culture.

As with many other RNA viruses, it is believed that the 5' and 3' terminal sequences of the
Flaviviridae contain conserved cis-elements important for translation, RNA replication, and
packaging (Bukh et al., Proc. Natl. Acad. Sci. USA 89:4942-4946, 1992; Deng et al., Nucleic
Acids Res. 21:1949-1957, 1993; Cahour et al., Virol. 207:68-76, 1995; Kolykhalov et al., J.
Virol. 70:3363-3371, 1996; Men et al., J. Virol. 70:3930-3937, 1996; Tanaka et al., J. Virol.
70:3307-3312, 1996; Huang HV. 1997. Evolution of the alphavirus promoter and the
cis-acting sequences of RNA viruses. In: Saluzzo J-F, Dodet B, eds. Factors in the emergence of
arbovirus diseases. Paris: Elsevier Press, pp. 65-79; Mandl et al., J. Virol. 72:2132-2140,
1998). The 5' nontranslated region (NTR) functions initially at the level of translation.
Similar to most cellular mRNAs, flavivirus genome RNAs are translated in a cap-dependent
manner. These RNAs contain a 5' cap structure that is presumably added by virus-encoded
RNA triphosphatases, guanylyl-, and methyl-transferases (Rice, 1996, supra). In contrast, the
translational strategy employed by pestiviruses and HCV is more similar to that of the
picornaviruses. These RNAs appear to be uncapped and contain long 5' NTRs with cis RNA
elements that function as internal ribosome entry sites (IRES) for translation initiation at the
polyprotein AUG (Lemon et al., Semin. Virol. 5:274-288, 1997).

The 5' NTRs of HCV and BVDV have a similar structural and functional organization
despite containing only short stretches of high sequence identity (Wang et al., Curr. Top.
Microbiol. Immunol. 203:99-115, 1995; Lemon et al., 1997, supra). The IRES within each
NTR is located at the 3' end of the NTR at a position proximal to the AUG initiation codon of
the ORF. Although the 5' terminal sequence of each of these viruses is apparently not
required for IRES function (Rijnbrand et al., FEBS Lett. 365:115-119, 1995; Honda et al.,
Virology 222:31-42, 1996; Rijnbrand et al., Virol. 77:451-457, 1997), these sequences are
highly conserved among different strains of HCV (Bukh et al., Proc. Natl. Acad. Sci.
USA 89:4942-4946, 1992) or BVDV (Deng et al., 1993, supra), suggesting they play other
roles in viral replication. For example, sequences in the 5' NTR may be required for
regulating translation versus initiation of negative-strand RNA synthesis. Such regulation
could occur by direct interaction of 5' and 3' RNA elements or indirectly, via RNA-protein
interactions. Sequences in the 5' NTR may also modulate packaging versus translation.
Finally, sequences complementary to the 5' NTR, which are located at the 3' end of
negative-strand RNA, are likely to function in the initiation of positive-strand RNA synthesis.

The HCV 3' NTR contains an internal polypyrimidine tract followed by a highly 
conserved sequence of 98 bases at the 3' terminus, which has been shown to be required for 
replication of HCV (U.S. Application Serial No. 08/811,566).

Further elucidation of the role of sequences in the HCV 5' and 3' NTRs has been
hampered by the inefficient replication of HCV in cell culture. This aspect of HCV biology
also makes it difficult to identify and test possible antiviral compounds for activity against
HCV. Thus, a need exists for a system which facilitates investigation of HCV replication and
therapeutic approaches to control HCV infections.

Summary of the Invention

Briefly, therefore, the present invention provides novel compositions and methods for
studying HCV replication which are based on the discovery that chimeras of HCV and BVDV
genomic sequences can be constructed that are able to replicate in cell culture. The
BVDV-specific sequence provides the chimeric viral nucleic acid with the ability to replicate in cell
culture, while the HCV-specific sequence allows the chimeric viral nucleic acid to be used to
screen possible compounds for anti-viral activity against HCV. It is believed that similar
replication-competent chimeras can be constructed from HCV and other pestiviruses.

Thus, in one embodiment, the present invention provides a novel, chimeric viral RNA
in which at least one of the 5' NTR, ORF and 3' NTR regions is chimeric and comprises a
nucleotide sequence from the corresponding region of a pestivirus in operable linkage with a
nucleotide sequence from the corresponding region of a hepatitis C virus (HCV). The
chimeric viral RNA is replication-competent. In preferred embodiments, the pestivirus is
BVDV.

In other embodiments, the invention provides a polynucleotide comprising a
DNA-dependent promoter operably linked to a cDNA of a chimeric viral RNA as described above,
and cells transiently transfected or stably transformed with the polynucleotide. In some
embodiments the cDNA may encode a dominant selectable marker or an assayable reporter.

In yet another embodiment, the invention provides a method for identifying
compounds having anti-HCV activity. The method comprises providing a first cell containing
a chimeric viral nucleic acid derived from HCV and a pestivirus as described above and a
second cell containing the pestivirus, and then comparing the replication efficiency of the
chimeric viral nucleic acid in the presence and absence of a test compound to the replication
efficiency of the pestivirus in the presence and absence of the test compound,
wherein a greater compound-induced reduction in replication efficiency of the chimeric viral
nucleic acid than of the pestivirus indicates the compound has anti-HCV activity.
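
By way of illustration only, the comparison recited in this method can be sketched in a few
lines of Python. The function names, the use of a fractional reduction as the measure of
"replication efficiency" lost, and the optional margin parameter are assumptions introduced
here for clarity; they are not taken from the disclosure, and any quantitative readout
(e.g., titer or labeled RNA) could be substituted.

```python
def fractional_reduction(untreated: float, treated: float) -> float:
    """Fraction of replication lost when the test compound is present.

    Both arguments are replication readouts in the same (arbitrary) units.
    """
    if untreated <= 0:
        raise ValueError("untreated replication level must be positive")
    return 1.0 - treated / untreated


def indicates_anti_hcv_activity(
    chimera_untreated: float,
    chimera_treated: float,
    pestivirus_untreated: float,
    pestivirus_treated: float,
    margin: float = 0.0,  # hypothetical tolerance; not specified in the text
) -> bool:
    """True when the compound reduces chimera replication more than it
    reduces replication of the parental pestivirus."""
    chimera_drop = fractional_reduction(chimera_untreated, chimera_treated)
    pestivirus_drop = fractional_reduction(pestivirus_untreated, pestivirus_treated)
    return chimera_drop - pestivirus_drop > margin
```

A compound that lowers the chimera readout from 1e6 to 1e4 while lowering the pestivirus
readout only from 1e6 to 9e5 would be flagged under this sketch, whereas a compound that
suppresses both equally would not, since equal suppression points to a target shared with
the pestivirus rather than an HCV-specific one.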

The invention also provides a genetically-engineered virus which comprises a
chimeric viral nucleic acid derived from HCV and a pestivirus as described above. In one
embodiment the genetically-engineered virus comprises virus particles containing at least one
HCV structural protein and is useful in a vaccine against HCV. In another embodiment, the
genetically-engineered virus is attenuated as compared to the pestivirus and is useful as a
vaccine against the pestivirus.

In a still further embodiment, the invention provides a replication-competent BVDV
vector expressing a heterologous sequence. The BVDV vector comprises the BVDV
sequences encoding the BVDV replication machinery. In some embodiments, the
replication-competent BVDV vector expresses an antigen and is useful as a vaccine.

Brief Description of the Drawings 

Figure 1 is a schematic representation of the 5' NTRs of BVDV, HCV, and EMCV
showing the position of the start codons of the ORF, with the boxes indicating the canonical
IRES elements.

Figure 2 shows a schematic representation of BVDV and HCV chimeras, plaque
phenotypes, reticulocyte translation efficiencies relative to parental BVDV, specific
infectivities in MDBK cells, titers at 24 and 48 h post-transfection (or 72 h, as indicated), and
an indication of whether pseudorevertants arose. Results from BVDV, 5'HCV,
BVDV+HCV, and BVDV+HCVdelB3 chimeras are shown in Fig. 2A, and results from
BVDV+HCVdelB2B3, BVDV+HCVdelB1B2B3, BVDV+HCVdelB2B3H1, and
BVDV+HCVdelB2B3H1H2 are shown in Fig. 2B, where N.D. means not determined.

Figure 3 illustrates the in vitro translation efficiency of BVDV RNA or chimeras,
showing bar graphs of the amount of Npro, the N-terminal protein in the BVDV ORF,
expressed by the various constructs.

Figure 4 illustrates a schematic representation of EMCV chimeras, plaque 
phenotypes, reticulocyte translation efficiencies relative to parental BVDV, specific 
infectivities in MDBK cells, titers at 24 and 48 h post-transfection (or 72 h, as indicated), and 
an indication of whether pseudorevertants arose.

Figure 5 illustrates a pseudorevertant analysis showing (Fig. 5A) the relative
positions of mutations detected within the plaque-purified variants of passaged
BVDV+HCVdelB1B2B3, 5'EMCV, and 5'HCV, and (Fig. 5B) the 5' terminal sequences of
pseudorevertants of BVDV+HCVdelB1B2B3, 5'EMCV, and 5'HCV. Novel nucleotides or
sequences are shown in bold upper case type. Pseudorevertants are numbered and designated
by the suffix ".R". The upper case sequence in BVDV+HCVdelB1B2B3 and
BVDV+HCVdelB1B2B3.R1 is a remnant of downstream BVDV 5' NTR sequences and was
created during the cloning procedures.

Figure 6 illustrates the construction of derivatives of 5'HCV designed to contain 5'
termini corresponding to the sequence detected within the three analyzed pseudorevertants.
Fig. 6A shows the 5' terminal sequence of the 5'HCV derivatives, with the suffix (orig)
designating a derivative containing the original 5' terminal sequence of the pseudorevertant;
the suffix (cons) designating a derivative containing the consensus tetranucleotide sequence
5'-GUAU at the same position; and novel sequences shown in bold upper case type. Fig. 6B
shows plaque phenotypes, reticulocyte translation efficiencies relative to parental BVDV,
specific infectivities in MDBK cells, and titers at 24 and 48 h post-transfection.

Figure 7 illustrates a single step growth curve for various chimeric constructs,
showing released virus titers measured by performing plaque assays on MDBK cells
transfected with various constructs.

Figure 8 illustrates replication of BVDV RNA or chimeric derivatives in transfected
MDBK cells. Equal numbers of MDBK cells (~8 x 10^6) were electroporated with 5 µg of
each in vitro synthesized RNA. MDBK cells were also transfected with infectious yellow
fever 17D and Sindbis RNAs to provide molecular mass markers. One fifth of the transfected
cells were seeded on 35-mm dishes and incubated in D-MEM supplemented with 10% horse
serum for 6 h at 37°C. The media were then replaced with 1 ml of fresh media containing 2
µg/ml of actinomycin D and 40 µCi/ml of 3H-uridine. Incubations were continued for 10 h at
37°C. RNAs were isolated as described in Materials and Methods, and 1/4 of each sample
was denatured in glyoxal and loaded on an agarose gel. (A) Autoradiograph of the dried gel.
Only the portion of the gel containing the genomic RNAs is shown. (B) Amount of
radioactivity contained within the displayed fragments as determined by scintillation
counting. BVDV, lane 1; 5'HCV, lane 2; BVDV+HCVdelB2B3, lane 3;
BVDV+HCVdelB2B3H1, lane 4; 5'HCV.R1orig, lane 5; 5'HCV.R1cons, lane 6;
5'HCV.R3orig, lane 7; 5'HCV.R3cons, lane 8; 5'HCV.R2orig, lane 9; 5'HCV.R2cons, lane 10;
yellow fever 17D, lane 11; Sindbis, lane 12; non-transfected MDBK cells, lane 13. The
experiment shown is one of two repetitions which yielded similar results.

Figure 9 illustrates the genetic map of plasmid pACNR/BUD. 

Figure 10 illustrates the sequence of low copy number plasmid pACNR/BVDV
NADL (circular) harboring the functional cDNA of cytopathic BVDV NADL (positive sense
cDNA 5' to 3'; nt 1-12578).

Figure 11 illustrates the sequence of infectious BVDV NADL (positive sense cDNA
5' to 3').

Figure 12 illustrates the sequence of infectious non-cytopathic BVDV NADL lacking
cIns (positive sense cDNA 5' to 3').

Figure 13 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R1.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 14 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R1.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 15 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R2.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 16 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R2.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 17 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R3.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 18 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R3.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 19 illustrates the sequence of prototype HCV-BVDV chimera from
pNADL/5'HR3.orig/3'H3'B with the adapted HCV 5' NTR from 5'HCV/R3.orig and tandem 3'
NTR elements from HCV followed by BVDV (positive sense cDNA 5' to 3') as discussed in
Example 5.

Figure 20 illustrates various deletions of the poly(U) tract in the 3' NTR HCV
sequence of BVDV/HCV chimera p5H-3H33.

Figure 21 illustrates the schematic representation of functional HCV-BVDV chimera
from pCBV/p7.

Figure 22 illustrates the sequence of functional HCV-BVDV chimera from pCBV/p7
(positive sense cDNA 5' to 3').

Figure 23 illustrates the schematic representation of an HCV/BVDV chimera with
selectable marker.

Figure 24 illustrates the sequence of functional HCV-BVDV chimera from
pCBV/p7/IRES-pac expressing a dominant selectable marker conferring resistance to
puromycin (positive sense cDNA 5' to 3').

Figure 25 illustrates the schematic representation of a bicistronic HCV/BVDV
chimera.

Figure 26 illustrates the sequence of functional bicistronic chimera expressing the
entire HCV structural region derived from plasmid pNADL/BI#41/HCV str (positive sense
cDNA 5' to 3').

Description of the Preferred Embodiments 

In accordance with the present invention, the inventors herein have succeeded in
generating HCV-BVDV chimeric RNAs which are replication competent. Such chimeras are
useful in screening compounds in vitro for antiviral activity against HCV. In addition, it is
believed that in vivo replication of HCV-BVDV chimeras according to the invention may be
attenuated as compared to wild-type BVDV and thus may be useful in vaccinating animals
against BVDV. It is also believed that the HCV chimeric structures described herein for
BVDV are applicable to other pestiviruses.

In the context of this disclosure, the following terms will be defined as follows unless
otherwise indicated:

"Cis-acting sequences" means the nucleotide sequences from an RNA virus genome
that are necessary for recognition of the genomic RNA by specific protein(s) of the RNA
virus or host cell that carry out replication, transcription, translation or packaging of the
genome.

"Genetically-engineered virus" means any virus whose genome is different from that
of a wild-type virus due to a human-made deletion, insertion, or substitution of one or more
nucleotides in the wild-type viral genome.

"Infectious" when used to describe a virus means the virus is capable of entering cells
and initiating a virus replication cycle, whether or not this leads to the production of new
RNA virus particles.

"Nucleotide sequence" as used herein refers to DNA and the corresponding RNA
sequence where relevant. It will be understood that sequences shown in the Figures are DNA
versions of the RNA sequence and that chimeric molecules of the invention may comprise
RNA molecules or cDNA copies of such RNA molecules.

"Replication-competent" as applied to a chimeric HCV-pestivirus RNA means the
RNA is capable of RNA-dependent replication in at least one cell type that supports
replication of the wild-type parental pestivirus. The number of replicated RNA molecules
produced by an HCV-pestivirus chimeric RNA of the invention is at least 10-fold higher than
the limit of detection, which is typically 10 to 100 molecules. More preferably, chimeric
RNA production by the HCV-pestivirus chimeric RNA is at least 10^2- to 10^3-fold higher than
the detection limit. The replication-competent chimeric RNA replicates at an efficiency that
is preferably at least 0.001%, more preferably at least 0.01%, more preferably at least 0.1%,
more preferably at least 1%, more preferably at least 10%, and most preferably at least 50%
up to 90% that of the parental pestivirus in the same cell type.
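
The numeric thresholds in this definition can be expressed, as an illustrative sketch only, by
the following Python helpers. The function and constant names are hypothetical, and the
inputs are assumed to be molecule counts (or yields) measured in the same units for the
chimera and the parental pestivirus in the same cell type.

```python
from typing import Optional

# Preferred relative-efficiency tiers from the definition, in percent of the
# parental pestivirus (0.001% up to the "most preferably" 50% tier).
EFFICIENCY_TIERS_PERCENT = (0.001, 0.01, 0.1, 1.0, 10.0, 50.0)


def exceeds_detection_limit(
    replicated_molecules: float,
    detection_limit: float,
    fold: float = 10.0,  # definition requires at least 10-fold over the limit
) -> bool:
    """True when RNA production is at least `fold` times the detection limit."""
    return replicated_molecules >= fold * detection_limit


def highest_tier_met(chimera_yield: float, parental_yield: float) -> Optional[float]:
    """Return the highest preferred tier (in percent) reached by the chimera,
    or None if it falls below the lowest tier."""
    if parental_yield <= 0:
        raise ValueError("parental yield must be positive")
    percent_of_parent = 100.0 * chimera_yield / parental_yield
    met = [tier for tier in EFFICIENCY_TIERS_PERCENT if percent_of_parent >= tier]
    return max(met) if met else None
```

For example, with a detection limit of 100 molecules, a chimera producing 1,000 molecules
satisfies the 10-fold criterion, and a chimera yielding 5% of the parental pestivirus falls in
the "at least 1%" tier.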

"Transfected cell" means a cell containing an exogenously introduced nucleic acid
molecule, and includes cells that are transiently transfected with the exogenous nucleic acid.

"Transformed cell" or "stably transformed cell" means a cell containing an
exogenously introduced nucleic acid molecule which is present in the cytoplasm or nucleus of
the cell and may be stably integrated into the chromosomal DNA of the cell.

"Virus" means a virion, virus particle or a viral genome.

A chimeric viral RNA according to the invention is designed to comprise a 5' NTR,
an ORF, and a 3' NTR, at least one of which is a chimeric region containing two operably
linked nucleotide sequences that are from the same region of a pestivirus and an HCV.
Pestivirus-specific sequences useful in the invention can be taken from the appropriate
genomic region of any cytopathic or noncytopathic type I or type II BVDV isolate, classical
swine fever virus (CSFV) isolate, or border disease virus isolate. For a list of pestiviruses,
see Thiel, H.-J., P. G. W. Plagemann, and V. Moennig. 1996. Pestiviruses, p. 1059-1073. In
B. N. Fields, D. M. Knipe and P. M. Howley (ed.), Fields Virology. Raven Press, New York.
HCV-specific sequences can be taken from any strain or isolate of HCV, including but not
limited to HCV-1, HCV-1a, HCV-1b, HCV-1c, HCV-2a, HCV-2b, HCV-2c, and HCV-3a.
Preferably, the parental pestivirus is a cytopathic strain of BVDV and the parental HCV strain
is HCV-1.

The pestivirus- and HCV-specific sequences are operably linked in the chimeric
region, meaning the sequences are arranged such that the resulting chimeric structure is
functional in the context of replication of the pestivirus. For example, in one preferred
embodiment the chimeric viral RNA comprises a chimeric 5' NTR which comprises a
BVDV-specific 5' terminal sequence of 5'-(G/A)UAU and an IRES derived from HCV, with
the ORF and the 3' NTR consisting of a sequence from the same regions of BVDV. The
BVDV-specific sequences at the 5' terminus and in the ORF and 3' NTR are chosen such that
they are functional in the context of BVDV, meaning the chimeric viral RNA expresses the
replication machinery of BVDV and this replication machinery is capable of replicating the
chimeric RNA. In addition, translation of the BVDV ORF in the chimeric viral RNA is
dependent upon a functional HCV IRES. The presence of a functional HCV IRES in this
chimera allows the chimera to be used to screen for compounds that target the HCV IRES and
thereby inhibit translation of the BVDV ORF as well as replication of the chimeric virus.
Such compounds would be expected to also inhibit translation of the ORF in a wild-type HCV
and consequently inhibit HCV replication.
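
As a simple illustration of the 5' terminal requirement in this embodiment, the following
Python sketch tests whether a candidate sequence begins with 5'-(G/A)UAU. The function
name is hypothetical, and the conversion of T to U is an assumption made so that the cDNA
versions of the sequences (as presented in the Figures) can be checked as well as RNA.

```python
import re

# 5'-(G/A)UAU: G or A at position 1, followed by UAU.
_FIVE_PRIME_TETRANUCLEOTIDE = re.compile(r"[GA]UAU")


def has_bvdv_5prime_terminus(sequence: str) -> bool:
    """True when the sequence, read 5' to 3', starts with GUAU or AUAU.

    Accepts RNA or DNA letters in either case; DNA 'T' is read as RNA 'U'.
    """
    rna = sequence.strip().upper().replace("T", "U")
    # re.match anchors at the start of the string, i.e. the 5' terminus.
    return _FIVE_PRIME_TETRANUCLEOTIDE.match(rna) is not None
```

Under this sketch, a construct whose cDNA begins "GTAT..." or whose RNA begins "AUAU..."
passes the check, whereas one beginning "CUAU..." does not. The check addresses only the
terminal tetranucleotide and says nothing about whether the downstream HCV IRES is
functional.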

Compounds that could be screened for anti-HCV activity using this and other
HCV-BVDV 5' NTR chimeras include but are not limited to antisense RNAs, RNA decoys that
bind proteins involved in recognition of the HCV-specific sequences, ribozymes, and small
molecule inhibitors of critical RNA-protein interactions. The use of such substances for
therapeutic applications is known in the art. See, e.g., Amarzguioui M, et al., "Hammerhead
ribozyme design and application." Cell Mol Life Sci. 1998 Nov;54(11):1175-202; Welch PJ,
et al., "Expression of ribozymes in gene transfer systems to modulate target RNA levels."
Curr Opin Biotechnol. 1998 Oct;9(5):486-96; Bramlage B, et al., "Designing ribozymes for
the inhibition of gene expression." Trends Biotechnol. 1998 Oct;16(10):434-8; Gewirtz AM,
et al., "Nucleic acid therapeutics: state of the art and future prospects." Blood. 1998 Aug
1;92(3):712-36; Altman S., "RNase P in research and therapy." Biotechnology (N Y). 1995
Apr;13(4):327-9; Flanagan WM., "Antisense comes of age." Cancer Metastasis Rev. 1998
Jun;17(2):169-76; Agrawal S, et al., "Antisense therapeutics." Curr Opin Chem Biol. 1998
Aug;2(4):519-28; Caselmann WH, et al., "Synthetic antisense oligodeoxynucleotides as
potential drugs against hepatitis C." Intervirology 1997;40(5-6):394-9; Neckers LM.,
"Oligodeoxynucleotide inhibitors of function: mRNA and protein interactions." Cancer J Sci
Am. 1998 May;4 Suppl 1:S35-42; Agrawal S, et al., "Mixed backbone oligonucleotides:
improvement in oligonucleotide-induced toxicity in vivo." Antisense Nucleic Acid Drug Dev.
1998 Apr;8(2):135-9; Crooke ST., "An overview of progress in antisense therapeutics."
Antisense Nucleic Acid Drug Dev. 1998 Apr;8(2):115-22; Fraisier C, et al., "High level
inhibition of HIV replication with combination RNA decoys expressed from an HIV-Tat
inducible vector." Gene Ther. 1998 Dec;5(12):1665-76; Gervaix A, et al., "Gene therapy
targeting peripheral blood CD34+ hematopoietic stem cells of HIV-infected individuals."
Hum Gene Ther. 1997 Dec 10;8(18):2229-38; Nakaya T, et al., "Inhibition of HIV-1
replication by targeting the Rev protein." Leukemia 1997 Apr;11 Suppl 3:134-7; Nakaya T, et
al., "Decoy approach using RNA-DNA chimera oligonucleotides to inhibit the regulatory
function of human immunodeficiency virus type 1 Rev protein." Antimicrob Agents
Chemother. 1997 Feb;41(2):319-25; Smith C, et al., "Transient protection of human T-cells
from human immunodeficiency virus type 1 infection by transduction with adeno-associated
viral vectors which express RNA decoys." Antiviral Res. 1996 Oct;32(2):99-115; Bahner I, et
al., "Transduction of human CD34+ hematopoietic progenitor cells by a retroviral vector
expressing an RRE decoy inhibits human immunodeficiency virus type 1 replication in
myelomonocytic cells produced in long-term culture." J Virol. 1996 Jul;70(7):4352-60; Lee
SW, et al., "Inhibition of human immunodeficiency virus type 1 in human T cells by a potent
Rev response element decoy consisting of the 13-nucleotide minimal Rev-binding domain." J
Virol. 1994 Dec;68(12):8254-64; Lisziewicz J, et al., "Inhibition of human immunodeficiency
virus type 1 replication by regulated expression of a polymeric Tat activation response RNA
decoy as a strategy for gene therapy in AIDS." Proc Natl Acad Sci USA. 1993 Sep
1;90(17):8000-4; Bevec D, et al., "Inhibition of human immunodeficiency virus type 1
replication in human T cells by retroviral-mediated gene transfer of a dominant-negative Rev
trans-activator." Proc Natl Acad Sci USA. 1992 Oct 15;89(20):9870-4.

It is contemplated that a number of replication-competent chimeric structures can be
made that allow the function of various HCV sequence elements and proteins to be studied
and targeted in drug screening assays. For example, the invention includes
replication-competent HCV-pestivirus chimeras having a chimeric ORF. One such chimeric ORF is one
comprising an HCV sequence encoding the structural proteins and a pestivirus sequence
encoding the nonstructural proteins. It is believed that upon introduction into a cell, such an
HCV-BVDV ORF chimera will produce HCV-like virus particles that will be released from
the cell and be capable of infecting cells normally infected by wild-type HCV, i.e., cells
expressing an HCV receptor such as human CD81. Such ORF chimeras would be useful to
screen compounds for drugs that inhibit formation, release or entry of HCV particles. In
addition, ORF chimeras that produce virus particles containing at least one HCV structural
protein would be useful as vaccines against HCV. Other ORF chimeras contemplated by the
invention include, for example, chimeras comprising a pestivirus sequence encoding
structural proteins and an HCV sequence encoding one or more nonstructural proteins such as
the NS3 protease, NS4A cofactor, NS5A phosphoprotein/interferon resistance determinant
and/or the NS5B polymerase. Replication of such ORF chimeras would be dependent upon
the function of the HCV nonstructural protein(s), and these ORF chimeras could be used to
screen for drugs that target the HCV nonstructural protein(s) as well as to screen for and map
potential drug resistance mutations in HCV nonstructural proteins. In addition,
HCV-pestivirus ORF chimeras could be useful for developing alternative in vivo animal models for
HCV replication and HCV-associated hepatocellular carcinoma to evaluate antivirals and
anti-tumor agents.

The invention also provides replication-competent HCV-pestivirus chimeras having a
chimeric 3' NTR which contains one or more conserved elements of the HCV 3' NTR. Such
3' NTR chimeras would be useful for screening or evaluating compounds targeted against the
HCV 3' NTR. Compounds that could be screened include antisense RNA molecules,
ribozymes and small molecule inhibitors of critical RNA-protein interactions. One 3' NTR
chimera according to the invention comprises a BVDV 5' NTR, BVDV ORF and a chimeric
3' NTR which consists of an HCV-specific sequence derived from the HCV 3' NTR
immediately followed by a BVDV 3' NTR. The HCV-specific 3' NTR that allows for
replication in the context of BVDV has a deletion in the 3' NTR poly(U) tract but retains all the
other HCV 3' NTR elements, including the 98-base 3' terminal conserved element.

HCV-pestivirus chimeras included within the scope of the invention include those
comprising combinations of chimeric regions, i.e., 5' NTR and ORF chimeras; 5' NTR and 3'
NTR chimeras; ORF and 3' NTR chimeras; and chimeric RNAs in which each of the 5' NTR,
ORF and 3' NTR regions comprises an HCV sequence operably linked to a pestivirus
sequence.

The invention also provides chimeric RNAs having two ORFs, or bicistronic
HCV-pestivirus chimeras. Bicistronic chimeras contemplated by the invention include structures in
which the first ORF contains one or more HCV genes and is followed by a second IRES
operably linked to a second ORF encoding the pestivirus replicase machinery. It is also
contemplated that the first ORF may encode a heterologous sequence such as an antigen.

It is believed that many HCV-pestivirus chimeras of the invention will be attenuated
as compared to the parental wild-type pestivirus. Such attenuated chimeric RNA genomes
would be candidate vaccines in the form of live-attenuated virus particles or as RNA or
cDNA "genetic" vaccines.

The invention also includes vaccines against HCV which comprise an
immunogenically-effective amount of HCV-pestivirus particles or nucleic acid. Anti-HCV
vaccines comprising virus particles should preferably contain one or more HCV structural
proteins.

The therapeutic or pharmaceutical compositions of the present invention can be
administered by any suitable route known in the art including, for example, by injection such
as intraperitoneal, intravenous, subcutaneous, intramuscular, transdermal, intrathecal or
intracerebral injection. Administration can be either rapid, as by injection, or over a period of
time, as by slow infusion or administration of a slow release formulation.

Compositions according to the invention can be employed in the form of 
pharmaceutical or veterinary preparations. Such preparations are made in a manner well 
known in the pharmaceutical and veterinary arts. One preferred preparation utilizes a vehicle 
of physiological saline solution, but it is contemplated that other pharmaceutically acceptable
carriers such as physiological concentrations of other non-toxic salts, five percent aqueous
glucose solution, sterile water or the like may also be used. It may also be desirable that a 
suitable buffer be present in the composition. Such solutions can, if desired, be lyophilized 
and stored in a sterile ampoule ready for reconstitution by the addition of sterile water for 
ready injection. The primary solvent can be aqueous or alternatively non-aqueous. 

The carrier can also contain other pharmaceutically-acceptable excipients for
modifying or maintaining the pH, osmolality, viscosity, clarity, color, sterility, stability, rate
of dissolution, or odor of the formulation. Similarly, the carrier may contain still other
pharmaceutically-acceptable excipients for modifying or maintaining release or absorption or
penetration across the blood-brain barrier. Such excipients are those substances usually and
customarily employed to formulate dosages for parenteral administration in either unit dosage
or multi-dose form or for direct infusion into the cerebrospinal fluid by continuous or periodic
infusion.

It is also contemplated that certain formulations containing a chimeric virus according
to the invention are to be administered orally. Such formulations are preferably encapsulated
and formulated with suitable carriers in solid dosage forms. Some examples of suitable
carriers, excipients, and diluents include lactose, dextrose, sucrose, sorbitol, mannitol,
starches, gum acacia, calcium phosphate, alginates, calcium silicate, microcrystalline
cellulose, polyvinylpyrrolidone, cellulose, gelatin, syrup, methyl cellulose, methyl- and
propylhydroxybenzoates, talc, magnesium stearate, water, mineral oil, and the like. The
formulations can additionally include lubricating agents, wetting agents, emulsifying and
suspending agents, preserving agents, sweetening agents or flavoring agents. The
compositions may be formulated so as to provide rapid, sustained, or delayed release of the
active ingredients after administration to the patient by employing procedures well known in
the art. The formulations can also contain substances that diminish proteolytic degradation
and promote absorption such as, for example, surface active agents.

The specific dose is calculated according to the approximate body weight or body 
surface area of the patient or the volume of body space to be occupied. The dose will also be 
calculated dependent upon the particular route of administration selected. Such calculations 
can be made without undue experimentation by one skilled in the art. Exact dosages are
determined in conjunction with standard dose-response studies. It will be understood that the
amount of the composition actually administered will be determined by a practitioner, in the 
light of the relevant circumstances including the condition or conditions to be treated, the 
choice of composition to be administered, the age, weight, and response of the individual 
patient, the severity of the patient's symptoms, and the chosen route of administration. Dose
administration can be repeated depending upon the pharmacokinetic parameters of the dosage
formulation and the route of administration used. 

Replication-competent HCV-pestiviruses are generated by first choosing the HCV
function or sequence element to be studied. The HCV sequence can be obtained from
a plasmid clone of a partial or full HCV genome, either by using PCR to amplify a target region
containing the desired sequence or by restriction enzyme digestion. The HCV fragment is
then inserted into the desired location of a clone of the pestivirus genome using standard
techniques. Desired portions of the pestivirus genome may be deleted before or after addition
of the HCV fragment. The recombinant genomes are then transfected into a cell that supports
replication of the parental pestivirus genome, and their ability to replicate is tested using standard
assays. For example, replication can be assessed by virus-induced cytopathic effect; plaque
formation; detection of viral antigens and/or viral RNA accumulation; and by plaque assay
measuring released infectious virus. The inventors herein have found that the BVDV RNA
replication machinery works in many cell types, including bovine, hamster, mouse and human
cells. It has also been reported that BVDV RNAs can amplify in other cell types including
human hepatoma lines and hepatocytes (Behrens SE, et al., J Virol. 1998 Mar;72(3):2364-72).
The host cell range for a particular chimera will be dependent upon the properties of that
chimera as empirically determined.

As described below, some chimeras do not replicate stably as indicated by 
heterogeneity in the size of plaques produced by the chimeric virus. Upon passage,
pseudorevertants can frequently be isolated that are capable of stable replication. Such
pseudorevertants will have one or more deletions or base substitutions in the HCV and/or 
pestivirus sequences. Information derived from these gain-of-function mutations can be used 
to define the elements necessary for generating stable, replication-competent chimeras of 
HCV and a pestivirus. 

The invention provides a method for screening compounds for antiviral activity
against HCV. The method involves comparing a test compound's effect on replication of a
chimeric HCV-pestivirus RNA molecule as described above with the compound's effect on
replication of the parental pestivirus. Compounds which have a greater effect on replication
of the chimeric virus than the pestivirus are likely directed against the HCV portion of the
chimera. Typically, the method is performed by providing duplicate cell cultures containing a
chimeric viral RNA which is replication-competent in that cell, treating one of the cultures
with the test compound, and then measuring the replication efficiency of the chimeric RNA in
both cultures. Any effect induced by the compound is compared against the compound's
effect on replication of the parental pestivirus in cells of the same type. This control assay is
preferably performed at the same time using the same culture conditions.
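
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the logic, not a protocol from the specification: the class name, the function names, the readout units, and the 20-percentage-point margin are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ReplicationReadout:
    """Replication signal (e.g. plaque count or viral RNA level) in arbitrary units.

    Hypothetical container; the specification does not prescribe a readout format.
    """
    untreated: float  # culture without test compound
    treated: float    # duplicate culture with test compound

    def percent_inhibition(self) -> float:
        if self.untreated <= 0:
            raise ValueError("untreated control must show replication")
        return 100.0 * (1.0 - self.treated / self.untreated)


def is_hcv_selective(chimera: ReplicationReadout,
                     parental: ReplicationReadout,
                     margin: float = 20.0) -> bool:
    """Flag a compound that inhibits the chimera more than the parental pestivirus.

    `margin` (percentage points) is an assumed cut-off, not a value from the text.
    """
    return chimera.percent_inhibition() - parental.percent_inhibition() > margin


# Hypothetical readouts for one test compound.
chimera = ReplicationReadout(untreated=1000.0, treated=100.0)
parental = ReplicationReadout(untreated=1000.0, treated=900.0)
print(is_hcv_selective(chimera, parental))
```

A compound flagged this way would, under the reasoning in the text, be a candidate for acting on the HCV portion of the chimera; a compound that inhibits both viruses equally would not be flagged.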

The cells used in the screening assay can be prepared by transiently transfecting the
cells with the desired chimeric RNA molecule as described below. Alternatively, it is
contemplated that the chimeric RNA molecule can be constitutively expressed in the cell by
transfecting the cell with a polynucleotide comprising a cDNA of the chimeric RNA operably
linked to a DNA-dependent promoter. The chimeric cDNA may include a selectable marker,
which would allow for selection of cells expressing the chimeric RNA. It is also envisioned
that the selectable marker could be a dominant marker that allows selection of cells expressing
chimeras having adaptive mutations or selection of cells permissive for virus replication
(Frolov et al., J. Virol. 73:3854-3865, 1999). It is also contemplated that the cDNA could express
a reporter gene that could be assayed to measure RNA replication.

Alternatively, chimeric virus particles are incubated with a cell permissive for 
infection by the pestivirus in the presence or absence of the test compound and then 
replication of the chimeric virus is measured and compared to the replication of the parental 
pestivirus incubated with the same cell type in the presence or absence of the test compound.

Inhibition of replication can be measured in many ways, including assaying for the
reduction of virus-induced cytopathic effect; inhibition of plaque formation; reduced
production of viral antigens as detected by immunofluorescence assay; reduced viral RNA
accumulation; and reduction in released infectious virus from treated and untreated control and
chimera samples using a plaque assay. In addition, it is contemplated that a cell line that is
designed for pestivirus-specific transactivation of a reporter gene could be used directly or in 
lieu of a plaque assay. The reporter gene is operably linked to a promoter that is activated 
upon infection by the chimeric virus and production of the viral transactivator protein. 

Preferred embodiments of the invention are described in the following examples. 
Other embodiments within the scope of the claims herein will be apparent to one skilled in the
art from consideration of the specification or practice of the invention as disclosed herein. It 
is intended that the specification, together with the examples, be considered exemplary only, 
with the scope and spirit of the invention being indicated by the claims which follow the 
examples. 

Example 1

This example illustrates the construction and analysis of 5' HCV-BVDV chimeras as
reported in detail in Frolov et al. (RNA 4:1418-1435, 1998) which is incorporated in its
entirety by reference. A functional clone of BVDV (Mendez et al., Virol. 72:4737-4745,
1998) was used to construct and characterize a series of 5' NTR chimeras with sequences
derived from HCV and the picornavirus, encephalomyocarditis virus (EMCV). The results
help to define the requirements of a functional BVDV 5' NTR and provide
replication-competent BVDV-HCV chimeras dependent on a functional HCV IRES.

Example 2 

This example illustrates the construction of chimeras for expressing additional
functional portions of the HCV genome by addition of further HCV sequence downstream
from the functional or adapted HCV 5' NTR chimeras fused in-frame to the BVDV ORF.

One such construct (Figure 21) involves fusion of HCV sequences to BVDV
sequences in the p7 protein coding region (at a convenient BseRI restriction site). Both HCV
and BVDV encode a p7 protein that is located immediately downstream of the E2 protein.
The p7 protein is a small hydrophobic protein of unknown function. pCBV/p7 consists of the
first 79 bases of the BVDV 5' NTR encoding stem loop structures B1' and B1, followed by the
entire HCV 5' NTR, the entire HCV structural protein coding region and the first 36 amino
acids of HCV p7 fused to the C-terminal 31 amino acids of BVDV p7. The fused p7 gene is
followed by the remainder of the BVDV ORF including the entire nonstructural region and
the BVDV 3' NTR. Transfection of MDBK cells with the RNA corresponding to this
sequence (Fig. 22) leads to replication of the chimeric RNA and production of the expected
HCV and BVDV polyprotein cleavage products. Variations on this strategy are envisioned in
which all or part of the HCV polyprotein and cis elements important for RNA packaging can
be expressed in viable chimeras. In addition, the BVDV replicase regions of either cytopathic
or non-cytopathic pestiviruses (like NADL cIns-) can be used. Transfection of cells
permissive for HCV particle assembly, release and reinfection with this chimeric RNA can
be used to make HCV-like particles. These particles and this infection system can be used (i)
to screen for specific inhibitors of HCV particle assembly, release and reinfection, (ii) for
identifying antibodies capable of neutralizing HCV infectivity and (iii) as live or inactivated
vaccines. Furthermore, this embodiment of the invention demonstrates that the BVDV RNA
replication machinery can be used for expression of heterologous RNA and polypeptide
sequences and can be used as a vehicle for RNA or DNA "genetic" vaccination in which the
BVDV replicase amplifies the level of antigen expression by cytoplasmic RNA-dependent
replication.

Example 3 

This example illustrates chimeric RNAs that are modified to express dominant 
selectable markers, assayable markers or FACS sortable markers. 

Such variants can be used to select for chimeras capable of replication in particular
cell types, or to screen for cell types that are permissive for replication of the chimeric RNA.
Selectable markers include, but are not limited to, the genes encoding puromycin resistance
(puromycin N-acetyl transferase; PAC), neomycin resistance, blasticidin resistance,
hygromycin resistance, etc. Assayable markers include, but are not limited to, the genes
encoding β-galactosidase, luciferase, β-glucuronidase, etc. Easily sortable molecules include
single chain antibodies, cell surface markers, and non-toxic protein markers like green
fluorescent protein. In a specific example (Figures 23 and 24), the RNA encoded by
pCBV/p7 was modified to include a cassette at the beginning of the BVDV 3' NTR that is
comprised of the EMCV IRES driving the gene encoding PAC. This chimeric RNA can
replicate, expresses PAC and confers resistance to puromycin. This property can
be used to select for variants of the chimera that are capable of noncytopathic replication in
desired cell types and also provides a means of showing that cells harbor a functional
chimeric RNA. Desired variants can be identified, cloned and further characterized as
described in Example 1. Of note, this location in the BVDV genome and this strategy
for expressing heterologous genes may also be applied to using infectious attenuated
pestiviruses as gene expression vectors and as chimeric live vaccines against other animal
pathogens.

Example 4 

This example illustrates the use of the bicistronic strategy as an alternative to the in- 
frame fusions described in Example 2. 

A specific example is shown in Figure 25 and its sequence in Figure 26. In this
bicistronic chimera, the 5' sequences are identical to those of pCBV/p7 except that the HCV
ORF continues to include the first 246 amino acids of NS4B. The HCV sequence is followed
by the EMCV IRES fused to BVDV Npro, the N-terminal 10 aa of BVDV C, the C-terminal
19 aa of C, 9 N-terminal amino acids of Erns, 48 C-terminal amino acids of E2 and the
remainder of the BVDV NADL ORF and 3' NTR. The constructed BVDV ORF encodes a
functional BVDV RNA replicase. The deletions in the N-terminal portion of this ORF were
designed to preserve proper membrane topology and processing of the replicase. The
bicistronic chimeric RNA can replicate upon transfection of permissive BVDV host cells.

Example 5 

This example illustrates 3' NTR chimeras. Although initial attempts to recover viable
chimeric viruses in which the BVDV 3' NTR was completely replaced by that of HCV were
unsuccessful, a strategy similar to that detailed in Example 1 has produced chimeras that
harbor the conserved elements of the HCV 3' NTR. An initial tandem 3' NTR construct was
made in which the HCV 3' NTR was engineered to follow the BVDV ORF. The complete
BVDV 3' NTR was positioned 3' to the HCV 3' NTR after a short heterologous sequence. The
sequence of this parental construct, which replicated poorly, is shown in Figure 19. RNAs
transcribed from this plasmid were of low specific infectivity, suggesting that revertants or
pseudorevertants might have arisen. Indeed, isolation and sequence analysis of several
independent plaque-forming variants revealed that deletions of various lengths in the HCV
poly(U) tract had occurred. These revertant sequences are shown in Figure 20. When these
altered HCV 3' NTRs were reconstituted into the original tandem 3' NTR parent, they gave
rise to plaque-forming RNA transcripts of high specific infectivity, demonstrating that these
alterations restored the ability of the chimeric RNA to replicate. Large deletions in the U tract
gave rise to virus with more robust replication and larger plaques while stably maintaining the
conserved HCV 3' NTR 98-base element and the polypyrimidine "transition" region. Such
chimeric viruses can now be used to screen and evaluate antisense, ribozyme, and other
therapeutics targeted against this conserved HCV RNA element that is essential for
replication.

Materials and Methods

Plasmid Constructs 

pACNR/BVDV NADL was previously described (Mendez et al., 1998, supra).
pBVDV is a derivative of pACNR/BVDV NADL which contains a G-to-T transversion at nt
14994 that creates an Xba I site upstream of the T7 promoter (T. Myers & C.M. Rice,
unpubl.). To facilitate construction of the chimeras, subclones were created. First, two
fragments were isolated by PCR amplification of p90/HCVFLlongpU (Kolykhalov et al.,
Science 277:570-574, 1997) with primers #498 (5'-TGTACATGGCACGTGCCAGCCCC)
and #498 (5'-GATCAACTCCATGGTGCACGGTCT) and pBVDV with primers #481
(5'-AGACCGTGCACCATGGAGTTGATC) and #482
(5'-CGTTTCACACATGGATCCCTCCTC). These two fragments were digested with ApaL I
and ligated to produce a fragment containing a fusion of the HCV 5' NTR to the BVDV ORF.
This fragment was digested with Sac I and ligated into pGEM3Zf(-) which had been digested
with Sma I and Sac I to produce the subclone pGEM498-SacI. Next, a fragment containing
the BVDV 5' NTR was synthesized by PCR amplification of pBVDV with primers #183
(5'-TTTTCTAGATAATACGACTCACTATAGTATACGAGAATTAGAAAAGGCACTCG)
and #480 (5'-GGGGGCTGGCACGTGCCATGTACA). This fragment was digested with
Xba I and BsrG I and ligated into pGEM498-SacI digested with the same two enzymes, to
create the plasmid pGEMXbaI-SacI. pGEMXbaI-SacI contains a tandem fusion of the BVDV
5' NTR, the HCV 5' NTR, and the 5' portion of the BVDV Npro gene. pBVDV + HCV was
created by digesting pGEMXbaI-SacI with Xba I and Sac I and ligating the fragment into
pBVDV digested with the same two enzymes, and as such pBVDV + HCV contains the T7
promoter, followed by the entire 385-nt 5' NTR of BVDV, a GT dinucleotide (nt 386-387),
the entire 341-nt 5' NTR of HCV (nt 388-728), and the sequence of the BVDV NADL strain
including the ORF and 3' NTR. Derivatives of pBVDV + HCV containing deletions within
the BVDV 5' NTR and/or the HCV 5' NTR were created in the subclone pGEMXbaI-SacI, as
described below, prior to ligation into Xba I- and Sac I-digested pBVDV. For making
deletions, restriction sites with non-compatible protruding ends were treated with the
Klenow fragment of DNA polymerase I prior to ligation. For creation of pBVDV +
HCVdelB3 (deletion of nt 174-374, inclusive), pGEMXbaI-SacI was digested with Afl II and
BsrG I. For pBVDV + HCVdelB2B3 (deletion of nt 67-374), pGEMXbaI-SacI was digested
with Avr II and BsrG I. For pBVDV + HCVdelB1B2B3 (deletion of nt 33-374),
pGEMXbaI-SacI was digested with SnaB I and BsrG I. For pBVDV + HCVdelB2B3H1 (deletion of nt
67^3396), pGEMXbaI-SacI was digested with Avr II and Xcm I. For pBVDV +
HCVdelB2B3H1H2 (deletion of nt 67-513), pGEMXbaI-SacI was digested with Avr II and
Bsg I. For pBVDV + HCVdelB2B3H3 (deletion of nt 67-374, 518-704), subclone
pGEMXbaI-SacIdelB2B3 was digested with Sma I. p5'HCV was created by digesting
p90/HCVFLlongpU with Xba I and Nru I and ligating the fragment into pBVDV + HCV
digested with the same two enzymes.
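
The nucleotide numbering used for pBVDV + HCV and its deletion derivatives follows from simple arithmetic on the segment lengths given above. The short Python sketch below reproduces that bookkeeping; the function and variable names are hypothetical, and only the segment lengths and deletion endpoints are taken from the text.

```python
def layout(segments):
    """Return {name: (start, end)} using 1-based, inclusive coordinates."""
    coords, pos = {}, 1
    for name, length in segments:
        coords[name] = (pos, pos + length - 1)
        pos += length
    return coords


# Segment lengths as stated for the tandem 5' NTR of pBVDV + HCV.
SEGMENTS = [("BVDV 5' NTR", 385), ("GT dinucleotide", 2), ("HCV 5' NTR", 341)]
COORDS = layout(SEGMENTS)


def to_hcv_numbering(chimera_nt):
    """Convert a chimera coordinate inside the HCV 5' NTR to HCV numbering."""
    start, end = COORDS["HCV 5' NTR"]
    if not start <= chimera_nt <= end:
        raise ValueError("position lies outside the HCV 5' NTR")
    return chimera_nt - start + 1


def deletion_length(first_nt, last_nt):
    """Number of nucleotides removed by an inclusive deletion."""
    return last_nt - first_nt + 1


# Deletion endpoints quoted in the text (chimera numbering, inclusive).
DELETIONS = {
    "HCVdelB3": (174, 374),
    "HCVdelB2B3": (67, 374),
    "HCVdelB1B2B3": (33, 374),
    "HCVdelB2B3H1H2": (67, 513),
}

for name, (first, last) in DELETIONS.items():
    print(name, deletion_length(first, last), "nt removed")
```

Under this numbering the HCV 5' NTR occupies nt 388-728 of the chimera, so a chimera coordinate such as 513 corresponds to position 126 of the HCV 5' NTR.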

The EMCV plasmid, pEC9, was provided by Ann Palmenberg and is described
elsewhere (Hahn et al., J. Virol. 69:2697-2699, 1995). p5'EMCV contains the entire 710 nt of
the 5' NTR of EMCV, followed by the open reading frame of BVDV and the 3' NTR. One
extra G residue was added between the T7 promoter and the first nucleotide of the EMCV 5'
NTR to facilitate efficient in vitro transcription. Convenient restriction sites within the
BVDV 5' NTR or the EMCV 5' NTR were used to create additional chimeras. Sites with
noncompatible protruding ends were treated with the Klenow fragment of DNA polymerase I
prior to ligation. For example, the plasmid pBVDV + EMCVdelA contains nt 1-378 of
BVDV 5' NTR fused with nt 45-710 of EMCV (the BsrG I site of BVDV ligated to the EcoR
V site of EMCV). pBVDV + EMCVdelB3A contains nt 1-173 of BVDV fused with nt 45-710
of EMCV (the Afl II site of BVDV ligated to the EcoR V site of EMCV). pBVDV +
EMCVdelB2B3A contains nt 1-66 of BVDV fused with nt 45-710 of EMCV (the Avr II site
of BVDV ligated to the EcoR V site of EMCV). pBVDV + EMCVdelB3ABC contains nt
1-173 of BVDV fused with nt 161-710 of EMCV (the Afl II site of BVDV ligated to the
Psp1406 I site of EMCV). pBVDV + EMCVdelB2B3ABC contains nt 1-66 of BVDV fused with nt
161-710 of EMCV (the Avr II site of BVDV ligated to the Psp1406 I site of EMCV). pBVDV
+ EMCVdelB3A-H contains nt 1-101 of BVDV fused with nt 289-710 of EMCV (the Nhe I
site of BVDV ligated to the Avr II site of EMCV). pBVDV + EMCVdelB2B3A-H contains
nt 1-62 of BVDV fused with nt 289-710 of EMCV (the Avr II site of BVDV ligated to the Avr
II site of EMCV). The schematics of the chimeric 5' NTRs are presented in Figures 2 and 4.
All other heterologous 5' NTRs used in the study were generated by PCR using an
oligonucleotide complementary to nt 256-272 of the HCV 5' NTR and primers containing the
sequence of the Xba I restriction site followed by the T7 promoter, the heterologous
sequences found in sequenced pseudorevertants, or sequences corresponding to different
regions of the HCV 5' NTR. All the fragments were subcloned into the plasmid pRS2 (a
derivative of pUC19), sequenced, and recloned into the p5'HCV plasmid by replacing the
fragment between the Xba I site located upstream of the T7 promoter and the Nhe I site (nt
249-254) in the 5' NTR of HCV.

Cell cultures

MDBK cells were obtained from M. Collett (ViroPharma, Inc.) and BT cells were
obtained from the American Type Culture Collection (Rockville, Maryland). Cells were
grown in Dulbecco's modified Eagle medium (D-MEM) supplemented with 10% horse serum
and sodium pyruvate.

Transcriptions and transfections

All the designed plasmids, including pBVDV and the chimeric derivatives, were
digested to completion with Sda I (Sse8387 I), purified by phenol extraction, precipitated with
ethanol, and dissolved in water. The transcription reactions were performed using the T7
MEGAscript kit (Ambion) under the conditions recommended by the manufacturer.
Reactions were incubated at 37°C for 1 h, and 3H-UTP was added to the reaction to quantify
the RNA synthesis. The quality of the synthesized RNAs was checked by agarose gel
electrophoresis, and samples containing 50-60% of full-length RNA were used for
electroporations and in vitro translations. The reaction mixtures were aliquoted and stored at
-70°C prior to electroporation or in vitro translation.

Transfection was performed by electroporation of MDBK cells using previously
described conditions (Mendez et al., 1998, supra). Two micrograms of in vitro synthesized
RNA, corresponding to approximately 1 µg of the full-length transcript, were used per
electroporation. In standard experiments, ten-fold dilutions of electroporated cells were
seeded in 6-well tissue culture plates containing 5 x 10^5 naive MDBK cells per well. After 1
h of incubation at 37°C in a 5% CO2 incubator, cells were overlaid with 3 ml of 0.6% LE
SeaKem agarose (FMC Bioproducts) containing minimal essential medium supplemented
with 5% horse serum. Plaques were stained with crystal violet after 3 days of incubation at
37°C. The rest of the transfected cells were seeded into 100-mm dishes and incubated for
approximately 48 h or until cytopathic effect was observed in virtually all cells. Samples of
the media were taken at 24 and 48 h, and virus titers were determined as described above and
previously (Mendez et al., 1998, supra).
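
As an illustration of how a specific infectivity could be derived from such a ten-fold dilution series, the following Python sketch converts plaque counts into PFU per microgram of RNA. It is a minimal sketch under stated assumptions: the countable range of 10-100 plaques per well, the choice to normalize to the amount of full-length transcript, and all plaque counts are hypothetical and are not taken from the specification.

```python
def specific_infectivity(plaques, dilution_factor, rna_ug):
    """PFU per microgram of RNA for one well of the dilution series.

    dilution_factor: e.g. 10**5 for the 1:100,000 dilution of electroporated cells.
    rna_ug: micrograms of RNA used for normalization (an assumption of this sketch).
    """
    if plaques < 0 or dilution_factor < 1 or rna_ug <= 0:
        raise ValueError("invalid input")
    return plaques * dilution_factor / rna_ug


def countable_wells(counts, low=10, high=100):
    """Keep wells whose plaque count falls in an assumed countable range."""
    return {d: n for d, n in counts.items() if low <= n <= high}


# Hypothetical plaque counts keyed by dilution factor.
counts = {10**4: 380, 10**5: 40, 10**6: 3}
usable = countable_wells(counts)
estimates = [specific_infectivity(n, d, rna_ug=1.0) for d, n in usable.items()]
print(estimates)
```

With these made-up counts only the 10^5 well falls in the countable range, and the estimate scales linearly with both the plaque count and the dilution factor.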

Analysis of the 5' ends of viral genomes

Sequencing of the 5' ends of selected variants of BVDV was performed on plaque-
purified viruses. Plaques were typically isolated from the agarose overlay without staining
with neutral red. Virus was eluted in 1 ml of D-MEM/10% horse serum for several hours and
was used to infect 5 x 10^5 MDBK cells in 35-mm dishes. After 1 h of virus adsorption at
37°C, an additional 1 ml of D-MEM/10% horse serum was added to the dishes, and incubation
was continued for 36-48 h until cytopathic effect was observed in virtually all cells.

Fifty microliters of harvested viral stocks were clarified by low speed centrifugation,
and viral RNAs were isolated with TRIzol reagent (Gibco-BRL) using the protocol
recommended by the manufacturer. Sequencing of the 5' termini was performed using an
oligonucleotide/cDNA-ligation strategy described elsewhere (Troutt et al., Proc. Natl. Acad.
Sci. USA 89:9823-9825, 1992). The primer SI (5'-GTCGTTTCACACATGGATCC),
complementary to nt 710-729 of the BVDV genome, was used for cDNA synthesis. A
phosphorylated oligonucleotide tag (5'-GACTGTTGTGGCCTGCAGGGCCGAATT) with an
amino group on the 3' terminus was ligated to the first strand cDNA (Troutt et al., 1992,
supra). One tenth of this reaction mixture was used for PCR amplification. The primers for
PCR amplification were as follows: primer A (5'-GCCCTGCAGGCCACAACAGTC),
complementary to the tag; primer B (5'-TCAGGCAGTACCACAA), complementary to nt
281-296 of the HCV 5' NTR; and primer C (5'-GGAATGCTCGTCAAGAAGACAG),
complementary to nt 268-289 of the EMCV 5' NTR. The primer pairs A + B or A + C
were used for analysis of the pseudorevertants of 5'HCV and BVDV + HCVdelB1B2B3 or
5'EMCV, respectively. For the 5'HCV pseudorevertants, one tenth of the ligation mixture
was used for an additional PCR reaction. This fragment was synthesized using primer SI,
described above, and a primer corresponding to nt 147-175 of the HCV genome. Fragments
were purified by agarose gel electrophoresis and cloned into the plasmid pRS2. Multiple
independent clones were sequenced by the standard dideoxy-mediated chain termination
method using the Sequenase version 2.0 DNA Sequencing Kit (USB).
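
The stated relationship between primer A and the ligated tag, and the agreement between primer lengths and the coordinate ranges quoted above, can be checked with a short Python sketch. The sequences are those given in the text with OCR spacing removed; the helper names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def span_length(first_nt, last_nt):
    """Length of an inclusive nucleotide range."""
    return last_nt - first_nt + 1


TAG = "GACTGTTGTGGCCTGCAGGGCCGAATT"      # oligonucleotide tag ligated to the cDNA
PRIMER_A = "GCCCTGCAGGCCACAACAGTC"       # described as complementary to the tag
PRIMER_SI = "GTCGTTTCACACATGGATCC"       # complementary to nt 710-729 (BVDV)
PRIMER_B = "TCAGGCAGTACCACAA"            # complementary to nt 281-296 (HCV 5' NTR)
PRIMER_C = "GGAATGCTCGTCAAGAAGACAG"      # complementary to nt 268-289 (EMCV 5' NTR)

# Primer A should equal the reverse complement of the 5' portion of the tag.
tag_target = TAG[:len(PRIMER_A)]
print(reverse_complement(tag_target) == PRIMER_A)

# Each primer length should match the size of the range it is said to cover.
print(len(PRIMER_SI) == span_length(710, 729))
print(len(PRIMER_B) == span_length(281, 296))
print(len(PRIMER_C) == span_length(268, 289))
```

This kind of check is useful after text extraction, because stray spaces or dropped characters in a primer sequence show up as a failed complement or length comparison.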

Cell-free translation

Cell-free translation reactions were performed in reticulocyte extracts (Promega)
using conditions recommended by the manufacturer. Usually 0.1-1 µg of the same in vitro
synthesized RNAs used in transfection experiments were used in 25 µl translation reactions.
After 45 min of incubation at 30°C, 2 µl were dissolved in 10 µl of sample buffer, and those
samples were analyzed by sodium dodecyl sulfate PAGE. Labeled proteins were visualized
by autoradiography of the dried gel. The efficiency of translation was measured using
phosphorimager analysis (Molecular Dynamics) by comparing the radioactivity in the band
corresponding to the Npro protein. In preliminary experiments, an eightfold increase in
incorporation was observed for translation of 4 µg versus 0.4 µg BVDV transcript RNA.
Quantitative data were obtained from reactions using subsaturating (0.4 µg) amounts of
BVDV or BVDV chimera transcript RNAs.

Analysis of virus specific RNAs

The protocols used for radioactive labeling of virus-specific RNAs are described in 
the appropriate figure legends. RNAs were isolated from the cells by using TRIzol reagent as 
recommended by the manufacturer (Gibco-BRL). After denaturation with glyoxal in 
dimethylsulfoxide, cellular RNAs were analyzed by electrophoresis in a 1% agarose gel
containing a 10 mM phosphate buffer. Pieces of the dried gel containing the appropriate 
RNA bands were excised, and their radioactivity measured by liquid scintillation counting. 

Results 

Features of the BVDV, HCV, and EMCV 5' NTRs important for chimera design

Schematic representations of the proposed secondary structures of the 5' NTRs of
HCV, BVDV, and EMCV are shown, and the location of each IRES is indicated, in Figure 1.
EMCV is a member of the cardiovirus genus within the family Picornaviridae. While not a
member of the Flaviviridae, EMCV is similar to HCV and BVDV in that it is a positive-strand
RNA virus shown to contain an IRES within its 5' NTR (Jang et al., J. Virol. 62:2636-2643,
1988). Based on their proposed secondary structures, the HCV IRES and the BVDV
IRES have been classified as type 3 IRESs, while the EMCV IRES is classified as a type 2
IRES (Lemon & Honda, Semin. Virol. 5:274-288, 1997). However, these three IRESs as
well as IRESs from other members of the Flaviviridae and the Picornaviridae have been
proposed to contain a common structural core (Le et al., Virus Genes 12:135-147, 1996).

The model for the secondary structure of the 341-nt HCV 5' NTR has been refined by
enzymatic and chemical analysis of synthetic transcripts (Brown et al., Nucl. Acids Res.
20:5041-5045, 1992; Wang et al., J. Virol. 65:7301-7307, 1994; Honda et al., RNA 2:955-968,
1996; Lima et al., 1997). This element contains four discrete hairpins (referred to here as H1,
H2, H3 and H4) and a pseudoknot at the base of hairpin H3 (Wang et al., 1995). The
secondary structure of the 385-nt BVDV 5' NTR has not been as extensively studied, but is
proposed to be similar to that of HCV (Brown et al., 1992) with four discrete hairpins
(referred to here as B1', B1, B2, and B3) and a pseudoknot at the base of B3 (Rijnbrand et al.,
1997). The secondary structure of the longer (>700 nt) EMCV 5' NTR consists of a series of
hairpins A-M (Duke et al., 1992; Hoffman & Palmenberg, 1996). Recently, a revised model
of the EMCV 5' NTR suggested moderately different secondary structures for the C and G
subregions, and significantly different secondary structures for the I-M subregion
(Palmenberg & Sgro, 1997).

For HCV, H1 is nonessential for IRES function (Reynolds et al., 1995; Rijnbrand et
al., 1995; Honda et al., 1996b; Reynolds et al., 1996; Kamoshita et al., 1997) and its deletion
has actually increased translation efficiency in some analyses (Rijnbrand et al., 1995; Honda
et al., 1996b). Most studies have found that hairpins H2 and H3 and the pseudoknot are
essential for IRES function (Wang et al., 1993; Rijnbrand et al., 1995; Honda et al., 1996b).
However, two studies indicate that H2 may not be essential (Tsukiyama-Kohara et al., 1992;
Urabe et al., 1997). The 3' boundary of the HCV IRES is more controversial. The IRES
clearly extends to the AUG initiation codon. However, some studies indicate that sequences
affecting the efficiency of translation initiation extend into the ORF (Reynolds et al., 1995;
Honda et al., 1996a; Honda et al., 1996b; Lu & Wimmer, 1996). By analogy to the HCV
IRES and the related pestivirus CSFV IRES, the BVDV IRES probably requires hairpins B2
and B3 and the pseudoknot for function, with B1' and B1 probably not required for IRES
activity (Poole et al., 1995; Rijnbrand et al., 1997). For EMCV, hairpins H-L have been
shown to be required for IRES function in mono- or dicistronic constructs (Jang & Wimmer,
1990; Duke et al., 1992). The remaining portion of the EMCV 5' NTR is thought to be
required for RNA replication or unknown steps in viral replication that are important for
pathogenesis (Duke et al., 1990; Martin & Palmenberg, 1996).

Replacement of the BVDV 5' NTR with the HCV 5' NTR results in a large decrease in
specific infectivity

Since the BVDV 5' NTR and the HCV 5' NTR are proposed to have similar RNA
secondary structure and functional organization, an experiment was performed to test whether
the BVDV 5' NTR could be replaced by the HCV 5' NTR. p5' HCV has an exact replacement
of the BVDV 5' NTR with that of HCV (Fig. 2A), while the coding sequence and 3' NTR of
p5' HCV are identical to pBVDV. Positioning of the HCV 5' NTR in such a manner was
necessary since translation initiation from the HCV IRES begins at or near the AUG start
codon (Honda et al., 1996a; Reynolds et al., 1995; Reynolds et al., 1996; Rijnbrand et al.,
1996). The specific infectivity of 5' HCV RNA synthesized in vitro was compared to that of
BVDV RNA by transfection of MDBK (bovine kidney) cells (Fig. 2A). The specific
infectivity of BVDV RNA was approximately 4 x 10^6 plaque forming units (PFU)/µg RNA.
In contrast, the specific infectivity of 5' HCV RNA was near the limit of detection (30-50
PFU/µg RNA) and considerable plaque heterogeneity was apparent. These results suggested
that the HCV 5' NTR replacement chimera might be incapable of efficient replication and
plaque formation and that the plaque forming virus observed had arisen by secondary
mutation(s). Sequence analysis of plaque-purified 5' HCV viruses presented below confirmed
that the replicating pool of virus contained such pseudorevertants.

Next, the in vitro translation efficiency of these two RNAs in rabbit reticulocyte
extracts was analyzed to test whether the defect in specific infectivity of 5' HCV RNA could
be attributed to lower translation efficiency. Although the specific infectivity of 5' HCV RNA
was reduced ~5 logs compared to BVDV RNA, its translation efficiency was only slightly
reduced, ~twofold (Fig. 3, lane 1 vs. lane 2). The apparent size of the N-terminal cleavage
product, Npro, was identical for both RNAs, suggesting that translation initiated with the
correct AUG. These data are consistent with the hypothesis that the BVDV 5' NTR contains
signals that are required for a step in replication other than translation and that are not
present in the 5' HCV chimera.
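
By way of illustration only, the contrast between the infectivity defect and the translation
defect can be checked with simple arithmetic. The sketch below assumes that both specific
infectivities are expressed in the same units (PFU per unit mass of RNA) and uses the values
quoted above; the helper name log10_reduction is hypothetical and not part of the described
methods.

```python
import math


def log10_reduction(reference: float, test: float) -> float:
    """Return how many orders of magnitude `test` falls below `reference`."""
    if reference <= 0 or test <= 0:
        raise ValueError("both values must be positive")
    return math.log10(reference / test)


# Values quoted in the text; assumed here to share the same units.
BVDV_INFECTIVITY = 4e6            # parental BVDV RNA
HCV5_INFECTIVITY_RANGE = (30.0, 50.0)  # 5' HCV RNA, near the detection limit

low = log10_reduction(BVDV_INFECTIVITY, HCV5_INFECTIVITY_RANGE[1])
high = log10_reduction(BVDV_INFECTIVITY, HCV5_INFECTIVITY_RANGE[0])
print(f"infectivity reduced by {low:.1f} to {high:.1f} logs")

# A twofold drop in translation corresponds to only about 0.3 log.
translation_drop = log10_reduction(2.0, 1.0)
print(f"twofold translation drop = {translation_drop:.2f} log")
```

On these figures the infectivity defect is roughly five orders of magnitude, whereas a twofold
change in translation is well under one, which is the basis for attributing the defect to a
step other than translation.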

Given the low specific infectivity of 5' HCV RNA, an experiment was performed to
test the effect of placing the BVDV 5' NTR sequence upstream of the HCV 5' NTR, resulting
in tandem BVDV and HCV 5' NTRs (called BVDV + HCV). This arrangement actually
decreased translation efficiency (Fig. 3, lane 14 vs. lane 1) yet restored infectivity (Fig. 2A).
The plaques produced by BVDV + HCV were also heterogeneous in size, indicating that this
virus was unstable. Upon passage, RT-PCR analysis indicated that pseudorevertants had
indeed arisen in which portions of the BVDV and/or HCV 5' NTRs had been deleted (data not
shown). These data show that sequences in the BVDV 5' NTR required for virus replication
can function when placed upstream of a functional HCV IRES driving translation of the
BVDV polyprotein.

Hairpins B1' and B1 in conjunction with the HCV IRES are sufficient for stable and
efficient BVDV replication

The sequences within the BVDV 5' NTR that restored replication in the context of the
HCV 5' NTR were mapped using three deletion variants. The deletion BVDV + HCVdelB3
removed a large portion of hairpin B3; the deletion within BVDV + HCVdelB2B3 removed
hairpins B2 and B3, and the deletion within BVDV + HCVdelB1B2B3 removed hairpins B1,
B2 and B3. The specific infectivities of RNAs from these deletion mutants were near that of
BVDV RNA (Fig. 2). Upon passage of these viruses, RT-PCR analyses and sequencing
indicated that BVDV + HCVdelB3 and BVDV + HCVdelB2B3 were stably propagated and
produced homogeneous plaques slightly smaller than those of wild-type BVDV (data not
shown). In contrast, BVDV + HCVdelB1B2B3 produced smaller heterogeneous plaques.
Reverse transcription-polymerase chain reaction (RT-PCR) analysis and sequencing indicated
that BVDV + HCVdelB1B2B3 underwent a reversion event described in more detail below.
The translation efficiencies of these three RNAs (Fig. 3, lanes 9, 10, and 12) were similar to
that of BVDV + HCV RNA (Fig. 3, lane 14), indicating that the deleted portions (hairpins B1,
B2, and B3) are not required for translation in the BVDV + HCV chimera. These results show
that B1' and B1 are the minimal elements sufficient for stable replication in conjunction with
the HCV 5' NTR.

Having shown that B1' and B1 are sufficient for replication in conjunction with the
HCV 5' NTR, we next conducted a deletion analysis to determine the sequences within the
HCV 5' NTR of BVDV + HCVdelB2B3 required for replication. A large portion of H1 was
deleted in BVDV + HCVdelB2B3H1, while both H1 and H2 were deleted in BVDV +
HCVdelB2B3H1H2. Of these two RNAs, only BVDV + HCVdelB2B3H1 was as infectious as
parental BVDV RNA (Fig. 2B). However, the BVDV + HCVdelB2B3H1 virus produced
smaller plaques than BVDV + HCVdelB2B3, indicating that hairpin H1 may augment
replication of the chimera. In contrast, BVDV + HCVdelB2B3H1H2 RNA was not
infectious (Fig. 2B) and was translated poorly (Fig. 3, lane 11). Diminished HCV IRES
activity might be due to deletion of hairpin H2 or juxtaposition of BVDV hairpins B1' and B1
with H3. A third derivative of BVDV + HCVdelB2B3, with a Sma I-Sma I deletion
abrogating HCV IRES function by removing H3, was also not infectious (data not shown).
Thus, a 5' NTR consisting of B1' and B1 and a functional HCV IRES is sufficient for stable
BVDV replication in MDBK cells. Similar results were obtained in BT cells, another BVDV-
permissive continuous bovine cell line (data not shown).

Replacement of the BVDV 5' NTR with the EMCV 5' NTR

The following experiment was performed to determine whether the BVDV 5' NTR
could be replaced by the 5' NTR of a more phylogenetically distant virus, EMCV. A
derivative of BVDV was created, called 5' EMCV, that contains an exact replacement of the
BVDV 5' NTR with the EMCV 5' NTR plus an additional guanosine residue at the 5' terminus
for more efficient transcription initiation by T7 polymerase (Fig. 4A). The specific infectivity
of 5' EMCV RNA was more than three orders of magnitude lower than that of BVDV RNA,
indicating that it was defective for replication, although its specific infectivity was higher than
that of 5' HCV RNA (compare Figs. 4A and 2A). Similar to 5' HCV, 5' EMCV produced
heterogeneous plaques, and sequence analysis indicated that pseudorevertants had arisen. The
lower specific infectivity of 5' EMCV RNA was not likely due to a defect in translation,
since the translation efficiency of 5' EMCV RNA was about threefold higher in vitro than that
of BVDV RNA (Fig. 3, lane 20 vs. lane 19).

As with BVDV + HCV, we also tested whether placing the BVDV 5' NTR at the 5'
end of the 5' EMCV RNA would increase its specific infectivity. BVDV + EMCVdelA (Fig.
4A) contained the entire BVDV 5' NTR in tandem with the EMCV 5' NTR lacking a portion
of hairpin A. BVDV + EMCVdelA RNA had a specific infectivity near that of BVDV RNA
(compare Figs. 4A and 2A) despite having a lower translation efficiency than 5' EMCV (Fig.
3, lane 21 vs. lane 20). Similar to the results with BVDV + HCV, this implicates the added
BVDV 5' NTR sequence in a step in viral replication other than translation. Two derivatives
of BVDV + EMCVdelA that contain deletions of portions of the BVDV 5' NTR but maintain
the sequence of B1' and B1, BVDV + EMCVdelB3A and BVDV + EMCVdelB2B3A (Fig.
4A), also were infectious. These derivatives had translation efficiencies near that of the
parental BVDV + EMCVdelA (Fig. 3, compare lanes 15 and 16 with lane 21). This
demonstrated that hairpins B1' and B1 were sufficient for replication in conjunction with a
large portion of the EMCV 5' NTR. Derivatives of BVDV + EMCVdelB3A or BVDV +
EMCVdelB2B3A that contain further deletions of EMCV (BVDV + EMCVdelB3ABC and
BVDV + EMCVdelB2B3ABC in particular) were translated efficiently (Fig. 3, lanes 17 and
18) and were infectious (Fig. 4B). This indicates that the chimeras did not require putative
EMCV RNA replication signals (Martin & Palmenberg, 1996). However, derivatives with
deletions extending into the canonical EMCV IRES were not infectious. For example, BVDV
+ EMCVdelB3A-H and BVDV + EMCVdelB2B3A-H, in which a portion of hairpin H is
deleted, were not infectious (Fig. 4B) and were inefficiently translated in vitro (Fig. 3, lanes
22 and 23). It should be noted that all of the BVDV + EMCV chimeras produced plaques of
heterogeneous size, indicating some instability.

Relatively simple 5' NTR mutations are observed in adapted pseudorevertants

As mentioned previously, BVDV + HCVdelB1B2B3 did not replicate stably, as
indicated by the heterogeneity in the size of plaques produced by this virus. Upon passage
and selection of medium plaque-producing variants, 5' RACE analysis and sequencing
indicated that nt 1-26 had been deleted in the pseudorevertants, removing a large portion of
B1', which was apparently deleterious in the absence of B1. This deletion results in the 5'
terminal sequence 5'-GUAUCG, which is identical to the first six bases of BVDV genome
RNA (Fig. 5) and is repeated at positions 27-32.

Analysis of the passaged 5' EMCV virus indicated that the replicating progeny had
also undergone a simple deletion of sequence at the 5' end to generate more efficiently
replicating variants (Fig. 5). After electroporation, the 5' EMCV virus pool was passaged 5
times at a multiplicity of infection of 0.1-1 PFU/cell on MDBK or BT cells, and the 5' termini
of three randomly picked plaques were sequenced. For all three plaques selected, nt 2-209
had been deleted, again creating a genome RNA with the 5' terminal tetranucleotide sequence
5'-GUAU.

Analysis of the 5' HCV progeny indicated that more complicated variants had arisen.
Most small plaque-producing variants were unstable and quickly reverted to medium plaque-
producing variants. However, one small plaque-producing variant and two stable medium
plaque-producing variants were isolated. 5' terminal sequences of the variants were amplified
by rapid amplification of cDNA ends (RACE) and cloned into a plasmid vector, and
sequences for several independent colonies were determined. The sequence of three clones of
the small plaque-producing virus (5'HCV.R1) contained a deletion of HCV sequence from nt
1-34 and an addition of the dinucleotide 5'-AU in two clones and 5'-GU in the third clone.
This creates a 5' terminus of 5'-(G/A)UAA (Fig. 5B), reminiscent of the first three bases of
the BVDV genome RNA (5'-GUA). Both medium plaque variants appeared to have arisen by
RNA recombination with non-viral sequences (Fig. 5). One medium plaque variant
(5'HCV.R2) had deleted the first 21 bases of the HCV sequence and contained instead a
heterologous sequence of 22 bases. BLAST searches revealed a perfect match between this
sequence and a sequence in a human retina cDNA of unknown function (Tsp509I). The
second medium plaque variant (5'HCV.R3) had also undergone a possible recombination
event leading to the addition of 12 nt to the 5' end of the HCV sequence. Given its short
length, multiple matches were found in the database with this sequence. As for the small
plaque variant, sequencing of multiple clones revealed heterogeneity at the extreme 5' end,
with either G or A identified as the 5' base. Remarkably, for both medium plaque variants,
the fused heterologous sequence began with the tetranucleotide sequence 5'-(G/A)UAU (Fig.
5B). For all three variants, sequencing of the entire 5' NTR and a portion of the Npro coding
region revealed only these changes at the 5' termini.

5' NTR sequence changes are sufficient for the pseudorevertant phenotypes

To assess the importance of these alterations at the 5' terminus of the 5' HCV
pseudorevertants, derivatives of 5' HCV were created with the changes determined by 5'
RACE (Fig. 6A), and the specific infectivities of these RNAs were analyzed (Fig. 6B).
Corresponding to the small plaque variant, a derivative called 5'HCV.R1orig was engineered
which contained a 5' NTR consisting of the dinucleotide 5'-GU at the 5' terminus of HCV nt
35-341. This results in a 5' terminus consisting of 5'-GUAA. 5'HCV.R1orig RNA had a
specific infectivity at least four orders of magnitude higher than 5' HCV RNA (Figs. 6B and
2A). This demonstrates that this 5' NTR structure is sufficient for phenotypic reversion to
high specific infectivity. However, small plaques and considerable heterogeneity were
observed for 5'HCV.R1orig, suggesting that additional mutations may be present in the
original small plaque variant.

The engineered derivative 5'HCV.R2orig had a 5' NTR consisting of 22 nt of
Tsp509I-homologous sequence followed by HCV nt 22-341. Another construct, called
5'HCV.R3orig, was made, which has the 12 nt of the other heterologous sequence fused to the
intact HCV 5' NTR. Specific infectivities for both these derivatives were essentially the same
as observed for wild type BVDV RNA (2-4 x 10^6 PFU/µg; Fig. 6B). Transfection with these
transcripts produced medium plaques, as observed for the original variants, and this
phenotype was stable upon passaging. These results show that the altered 5' NTR sequences
were responsible for the pseudorevertant phenotypes rather than changes elsewhere in their
genomes.



Addition of the tetranucleotide sequence 5'-GUAU to the HCV 5' NTR allows efficient
BVDV replication

For all three 5' HCV variants studied, as well as the BVDV + HCVdelB1B2B3 and
5'EMCV pseudorevertants, 5' NTR alterations seemed to involve creation of a three- or four-
base "consensus" sequence identical to the 5' terminus of BVDV genome RNA. To test the
importance of this sequence, as opposed to fused heterologous sequences, we created a set of
variants with the BVDV 5' tetranucleotide sequence linked to the HCV 5' NTR or the
deletion/recombinant break points identified during sequence analysis of the 5' HCV
pseudorevertants (Fig. 6A). 5'HCV.R1cons had the tetranucleotide sequence 5'-GUAU fused
to HCV nt 35-341. 5'HCV.R2cons had the 5'-GUAU tetranucleotide sequence fused to HCV
nt 22-341. 5'HCV.R3cons contained the tetranucleotide sequence 5'-GUAU fused to the intact
5' terminus of the HCV NTR. RNAs from all three of these derivatives had specific
infectivities more than five orders of magnitude higher than 5'HCV and comparable to
parental BVDV (Fig. 6B).
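
For illustration only, the layout of the three consensus constructs can be sketched in a few
lines of code. The sketch assumes 1-based, inclusive nucleotide coordinates and a 341-nt HCV
5' NTR, as implied by the ranges quoted above. The sequence used is a placeholder and is not
the HCV sequence; the helper name nt and the dictionary of variants are hypothetical.

```python
def nt(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end using 1-based, inclusive coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


# Placeholder standing in for a 341-nt HCV 5' NTR; NOT the real sequence.
HCV_5NTR = ("ACGU" * 86)[:341]
BVDV_5_TETRA = "GUAU"  # 5' terminal tetranucleotide of BVDV genome RNA

variants = {
    "5'HCV.R1cons": BVDV_5_TETRA + nt(HCV_5NTR, 35, 341),
    "5'HCV.R2cons": BVDV_5_TETRA + nt(HCV_5NTR, 22, 341),
    "5'HCV.R3cons": BVDV_5_TETRA + HCV_5NTR,  # intact HCV 5' NTR
}

for name, rna in variants.items():
    print(name, len(rna), rna[:4])
```

Under these assumptions the three 5' NTRs differ only in how much HCV sequence follows the
shared 5'-GUAU terminus.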

There were, however, significant differences between the phenotypes of some of
these derivatives and those of the reconstructed pseudorevertants. As mentioned above,
5'HCV.R1orig yielded tiny and small plaques and produced low virus yields even after 48 h.
In contrast, the addition of four bases rather than two bases (5'-GUAU vs. 5'-GU) yielded
virus with near wild-type plaque morphology (Fig. 6B) and growth rates (Fig. 7). In the case
of the smaller deletion, 5'HCV.R2orig and 5'HCV.R2cons were indistinguishable, suggesting
that, other than the 5' four bases, the fused heterologous sequences were dispensable. This
was not the case, however, for the chimera containing the 5'-GUAU tetranucleotide sequence
fused to the intact HCV 5' NTR. 5'HCV.R3cons produced small plaques (Fig. 6B) and grew
more slowly than 5'HCV.R3orig (Fig. 7), suggesting that the sequence/structure of the
sequences downstream of the 5' four bases can affect replication efficiency.

The tetranucleotide sequence 5'-GUAU is important for efficient BVDV RNA
accumulation

Next, the effects of the different 5' termini on virus-specific RNA accumulation
directly after transfection were analyzed. This allowed a direct comparison between 5' HCV
and the reconstructed pseudorevertants as well as selected BVDV + HCV deletion constructs.
MDBK cells were transfected with in vitro synthesized RNAs and labeled for 10 h beginning
at 5 h post-transfection with 3H-UTP in the presence of actinomycin D (Fig. 8). RNA
replication of the 5' HCV chimera was severely impaired to a level below detection (Fig. 8,
lane 2). In contrast, every 5' NTR alteration of 5' HCV that increased RNA specific
infectivity and allowed efficient virus growth led to readily detectable viral RNA
accumulation. Addition of B1' and B1 to the 5' terminus of the HCV 5' NTR restored RNA
replication to a level ~50% of that observed for BVDV (BVDV + HCVdelB2B3; Fig. 8, lane
3 vs. lane 1). BVDV + HCVdelB2B3H1 displayed reduced RNA synthesis compared to
BVDV + HCVdelB2B3 (Fig. 8, lane 4 vs. lane 3), perhaps explaining its small plaque
phenotype and suggesting a possible positive role for H1 in replication of this chimera.
5'HCV.R1orig, which had exhibited plaque heterogeneity and slow growth, accumulated less
RNA when compared to 5'HCV.R1cons (Fig. 8, lane 5 vs. lane 6). 5'HCV.R2orig and
5'HCV.R2cons showed similar RNA accumulation (Fig. 8, lane 9 vs. lane 10), consistent with
their medium plaque phenotypes; and 5'HCV.R3cons exhibited reduced RNA synthesis
compared to 5'HCV.R3orig (Fig. 8, lane 8 vs. lane 7), consistent with their small- versus
medium-plaque phenotypes.

Although these RNA phenotypes are complex, the most striking result is that addition
of the B1' and B1 hairpins, addition of heterologous 5' sequences terminating with 5'-GUAU, or
simply fusion of this tetranucleotide sequence with the HCV 5' NTR or short 5' truncations of
the HCV 5' NTR all dramatically upregulated RNA accumulation. This occurred without
increasing translation efficiency, at least as measured in a cell-free assay (Fig. 3, compare
lanes 3-8 to lane 1), suggesting that these sequences function at the level of RNA replication
or stability.

Discussion

The work presented here helps to define the requirements for a functional BVDV
5' NTR. The BVDV-specific 5' NTR sequences required for efficient replication in cell
culture are minimal and consist of the 5' terminal sequence, 5'-GUAU. The sequence 5'-AUAU,
detected for some pseudorevertants, may also be functional but this was not tested for
technical reasons. This simple 5'-terminal tetranucleotide sequence, which is conserved
among pestiviruses (Ruggli et al., 1996; Becher et al., 1998), was shown to function in the
context of functional IRES elements derived from the hepacivirus HCV or the picornavirus
EMCV. As discussed below, this may indicate that the 5' signals required for BVDV RNA
replication are rather simple or that elements in these heterologous IRESs can functionally
replace deleted BVDV sequences.

Sequences at the extreme 5' end of BVDV genome RNA could modulate the
efficiency of RNA accumulation by affecting RNA stability, translation, promoter efficiency,
or some combination of these processes. At this time, we cannot distinguish among these
possibilities but favor an effect on RNA replication. The complement of the BVDV 5'
sequence at the 3' end of the negative-strand RNA presumably functions in the initiation of
positive-strand RNA synthesis. Thus, AUAC-3' at the 3' terminus of minus-strand RNA may
be important for positive-strand RNA synthesis. Interestingly, for some positive-strand RNA
viruses such as rubella virus (Pugachev & Frey, 1998), flock house virus (Ball, 1994) and
turnip crinkle virus (Guan et al., 1997), only minimal cis-acting sequences at the 3' termini of
negative-strand RNAs are required for positive-strand RNA synthesis. In contrast to the 5' NTR
replacements, we were unable to generate replication-competent BVDV-HCV chimeras with
the HCV 3' NTR replacing that of BVDV (data not shown). This may indicate that the signals
within the pestivirus 3' NTR required for initiation of negative-strand RNA synthesis are more
complex and virus specific. Once the replication complex has assembled at the 3' NTR and
traversed the RNA during negative-strand synthesis, the requirements of the 5' NTR for
initiation of positive-strand synthesis may be minimal.
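
The relationship between the 5' terminus of the genome and the 3' terminus of the
negative strand is a reverse complement, which can be verified with a minimal sketch. The
function name reverse_complement is illustrative only and is not part of the described
methods.

```python
# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the complementary strand written 5' to 3'."""
    return rna.upper().translate(RNA_COMPLEMENT)[::-1]


plus_strand_5_end = "GUAU"  # 5' terminus of BVDV genome RNA
minus_strand_3_end = reverse_complement(plus_strand_5_end)
print(minus_strand_3_end)  # prints AUAC
```

Written 5' to 3', the last four bases of the negative strand are therefore AUAC-3', as stated
above.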

Although the RNA replication signals within the 5' NTR appear to be rather simple, it
is possible that the signals important for RNA replication actually extend into the IRES and
are more complicated. For instance, the 5'HCV pseudorevertants were more stable and grew
to higher titers than the 5'EMCV counterparts, despite the fact that the 5' EMCV RNAs were
translated more efficiently in vitro. This may indicate that the BVDV and HCV IRESs
contain signals important for RNA synthesis that are absent in the EMCV IRES.

It is perhaps not surprising that 5' HCV appeared to recombine with cellular mRNAs
to acquire a 5' terminus with the 5'-(G/A)UAU consensus, given that non-cytopathic strains
of BVDV can recombine with BVDV RNA or cellular mRNAs to generate cytopathic strains
of BVDV (Meyers & Thiel, 1996). Presumably, this recombination event involves template
switching during negative-strand RNA synthesis, as observed for poliovirus (Kirkegaard &
Baltimore, 1986). In contrast to 5' HCV, simple deletions of 5' terminal viral sequences could
account for the BVDV + HCVdelB1B2B3 and 5'EMCV pseudorevertants since the
tetranucleotide sequence is present in these 5' NTRs upstream of functional IRES elements.
Such deletions could occur by partial degradation of positive-strand template prior to
negative-strand synthesis, by premature termination during negative-strand RNA synthesis, or
by degradation of 3' terminal negative-strand sequence after synthesis. It is proposed that
5' HCV was forced to recombine with cellular sequences because HCV does not have a
5'-(G/A)UAU sequence upstream of its IRES. The first occurrence of a (G/A)UAU
tetranucleotide sequence is at nt 94-97 within hairpin H2, and a 5' deletion extending into this
sequence would presumably inactivate or severely impair HCV IRES activity. It is interesting
that BVDV + HCVdelB1B2B3 and 5'EMCV pseudorevertants were generated at much higher
frequency than 5' HCV pseudorevertants. This may indicate that recombination between
BVDV and cellular RNAs is a rare event compared to the processes which lead to deletion of
terminal viral sequences.
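
As an illustration of how the first occurrence of the (G/A)UAU consensus in a 5' NTR might
be located, the following sketch scans an RNA string and reports 1-based, inclusive
coordinates. The input sequence shown is a short made-up example and is not the HCV
sequence; the function name first_consensus is hypothetical.

```python
import re

# (G/A)UAU written as a regular expression.
CONSENSUS = re.compile(r"[GA]UAU")


def first_consensus(rna: str):
    """Return 1-based inclusive (start, end) of the first match, or None."""
    match = CONSENSUS.search(rna.upper())
    if match is None:
        return None
    return (match.start() + 1, match.end())


toy_rna = "CCCGGGCCAUAUGG"  # made-up sequence for demonstration only
print(first_consensus(toy_rna))
```

Applied to a real 5' NTR, the same scan would indicate how much 5' sequence a simple deletion
would have to remove to expose the consensus at the terminus.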

Poliovirus chimeras dependent upon a functional HCV IRES have been reported (Lu
& Wimmer, 1996). Interestingly, viable poliovirus chimeras were produced only when HCV
sequences included both the IRES and the N-terminal portion of the HCV ORF. Nucleotide
sequences or structures in the downstream ORF can modulate HCV IRES translational
efficiency (see Reynolds et al., 1995; Honda et al., 1996a) but it was also suggested that the
N-terminal portion of the HCV core polypeptide might be involved. In the case of our 5'
HCV pseudorevertants, there is no requirement for HCV C protein sequences. Although the
translation efficiency of the HCV IRES in the presence of additional HCV sequences 3' to the
AUG start was not directly assessed, the HCV chimeras and pseudorevertants were
translationally active and infectious in the absence of any portion of the HCV ORF. This
indicates either that the HCV IRES does not extend into the HCV ORF or that the BVDV
ORF contains analogous sequence which functions in our 5'HCV chimeras. There is some
limited identity between HCV and BVDV within this region. For example, HCV nt 359-394
and BVDV nt 405-440 are identical at 21 of 36 positions, although identity within this
sequence may be attributed to a high adenosine content. It is interesting to note that the
luciferase (LUC) and chloramphenicol acetyl transferase (CAT) reporter genes previously
used to detect HCV IRES activity (Tsukiyama-Kohara et al., 1992; Wang et al., 1993) also
have adenosine- or purine-rich regions in relatively the same position as the HCV ORF and
BVDV ORF. If this region is indeed important for IRES activity, this may explain why some
have observed that the HCV IRES does not require a portion of the HCV ORF for translation
of CAT or LUC (Tsukiyama-Kohara et al., 1992; Wang et al., 1993). Point mutations and
insertions within this region of HCV have been shown to reduce HCV IRES activity in vitro
(Honda et al., 1996a,b).
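
The identity figure quoted above for HCV nt 359-394 and BVDV nt 405-440 can be put in
percentage terms with a short calculation. The sketch below assumes 1-based, inclusive
coordinates and an ungapped, position-by-position comparison; the two short strings are
made-up examples, not viral sequence, and the function names are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match in two equal-length, ungapped strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Both quoted ranges span 36 nt, so 21 matches correspond to about 58%.
hcv_len = span_length(359, 394)
bvdv_len = span_length(405, 440)
quoted_identity = 100.0 * 21 / hcv_len
print(hcv_len, bvdv_len, f"{quoted_identity:.1f}%")

# Made-up strings showing the position-by-position comparison.
toy_identity = percent_identity("AAGAAACC", "AAGAUACG")
print(toy_identity)
```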

Despite the fact that B1' and B1 are conserved among different strains of BVDV and
similar hairpins are present in border disease virus and CSFV (Deng & Brock, 1993; Becher
et al., 1998), B1' and B1 were dispensable for BVDV replication, provided that the 5'
tetranucleotide sequence 5'-(G/A)UAU remained. This may indicate a role for B1' and B1 in
viral replication in vivo that we do not observe in cell culture. It will be interesting to test the
phenotype of chimeras that lack B1' and B1 in vivo to determine if they are attenuated and
might serve as useful BVDV vaccines. In this vein, several studies with flaviviruses have
demonstrated that alterations in 5' NTR or 3' NTR elements can lead to attenuation in vivo
(Cahour et al., 1995; Men et al., 1996; Mandl et al., 1998). BVDV chimeras that utilize the
HCV or EMCV IRES may also prove to be attenuated simply due to the presence of the
heterologous IRES. For poliovirus, it has been shown that differences in IRES efficiency in
different host-cell environments can modulate host range and virulence (Shiroki et al., 1997).

BVDV-HCV chimeras that are dependent on a functional HCV IRES may have
another practical application. It may be possible to use these chimeras to screen for anti-HCV
therapeutics that target the HCV IRES. Other researchers have shown antisense
oligonucleotide-mediated inhibition of HCV gene expression in hepatocytes by targeting the
oligonucleotides to the HCV IRES (Hanecak et al., 1996). It will be of interest to measure the
efficacy of antisense oligonucleotides or ribozymes (Lieber et al., 1996) against replicating
virus, and these chimeras are more useful than HCV for this purpose since they are able to
replicate efficiently in cell culture. BVDV is believed to be a reasonable model of HCV
replication not only because of homology and conserved motifs within the 5' NTR but also
because of similarities in overall genetic organization (Rice, 1996) and polyprotein processing
strategy (Tautz et al., 1997; Xu et al., 1997).
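
A screen of this kind could be scored by comparing how much a test compound reduces
replication of the chimera with how much it reduces replication of parental BVDV, since
only the chimera depends on the HCV IRES. The following is a minimal sketch under that
assumption. The titers, the margin parameter, and the function names are hypothetical and
are shown only to make the comparison concrete; they are not measured data.

```python
def fractional_reduction(untreated: float, treated: float) -> float:
    """Fraction of replication lost in the presence of the compound."""
    if untreated <= 0:
        raise ValueError("untreated value must be positive")
    return 1.0 - treated / untreated


def is_selective_for_chimera(chimera, bvdv, margin: float = 0.0) -> bool:
    """True if the chimera is inhibited more than BVDV by more than `margin`.

    `chimera` and `bvdv` are (untreated, treated) replication measurements.
    """
    chimera_drop = fractional_reduction(*chimera)
    bvdv_drop = fractional_reduction(*bvdv)
    return chimera_drop - bvdv_drop > margin


# Hypothetical titers (PFU/ml) as (untreated, treated); not measured data.
chimera_titers = (1e6, 1e4)
bvdv_titers = (1e6, 8e5)
selective = is_selective_for_chimera(chimera_titers, bvdv_titers, margin=0.5)
print(selective)
```

A compound that inhibits both viruses equally would not be flagged, which helps separate
effects on the HCV-derived element from general effects on the BVDV backbone or the host
cell.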

In view of the above, it will be seen that the several advantages of the invention are
achieved and other advantageous results attained.

As various changes could be made in the above methods and compositions without
departing from the scope of the invention, it is intended that all matter contained in the above
description and shown in the accompanying drawings shall be interpreted as illustrative and
not in a limiting sense.

All references cited in this specification, including patents and patent applications, are
hereby incorporated by reference. The discussion of references herein is intended merely to
summarize the assertions made by their authors and no admission is made that any reference
constitutes prior art. Applicants reserve the right to challenge the accuracy and pertinency of
the cited references.

What is Claimed is:

1. A polynucleotide comprising a chimeric viral RNA which comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV), and wherein said chimeric viral RNA is replication-competent.

2. The polynucleotide of claim 1, wherein the chimeric region is the 5' NTR and
the first pestivirus nucleotide sequence is from a bovine viral diarrhea virus (BVDV).

3. The polynucleotide of claim 2, wherein the BVDV nucleotide sequence is
located at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU.

4. The polynucleotide of claim 3, wherein the first HCV nucleotide sequence in
the chimeric 5' NTR comprises an internal ribosome entry site (IRES).

5. The polynucleotide of claim 4, wherein the ORF and the 3' NTR consist of
second and third BVDV sequences.

6. The polynucleotide of claim 5, wherein the 5' terminal sequence comprises 5'
GUAU.

7. The polynucleotide of claim 4, wherein the ORF comprises a second HCV
sequence encoding at least one structural protein operably linked to a second BVDV
sequence.

8. The polynucleotide of claim 1, wherein the pestivirus is BVDV and the
chimeric region is the 3' NTR.

9. The polynucleotide of claim 8, wherein the first HCV sequence in the
chimeric 3' NTR comprises the HCV 98 bp 3' terminal element (SEQ ID NO:X) operably
linked to the first BVDV sequence.

10. A method for identifying compounds having antiviral activity against
hepatitis C virus (HCV) comprising the steps of:

(a) providing a first cell containing a chimeric viral RNA which is replication-
competent in the cell, the chimeric viral nucleic acid comprising a 5' nontranslated region (5'
NTR), an open reading frame (ORF) region; and a 3' nontranslated region (3' NTR);
wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV);

(b) providing a second cell containing the pestivirus; and

(c) comparing the replication efficiency of the chimeric viral RNA in the
presence and absence of a test compound to the replication efficiency of the pestivirus in the
presence and absence of the test compound,

wherein a greater reduction in compound-induced replication efficiency of the chimeric viral
RNA than the pestivirus indicates the compound has anti-HCV activity.

11. The method of claim 10, wherein the chimeric region is the 5' NTR and the
first pestivirus nucleotide sequence is from a bovine viral diarrhea virus (BVDV).

12. The method of claim 11, wherein the BVDV nucleotide sequence is located
at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU.

13. The method of claim 12, wherein the first HCV nucleotide sequence in the
chimeric 5' NTR comprises an internal ribosome entry site (IRES).

14. The method of claim 13, wherein the ORF and the 3' NTR comprise second
and third sequences from the BVDV.

15. The method of claim 10, wherein the pestivirus is BVDV and the chimeric
region is the 3' NTR.

16. A genetically-engineered virus comprising a chimeric RNA genome which
comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV), and wherein said chimeric RNA genome is replication-competent.

17. The genetically-engineered virus of claim 16, wherein the chimeric region is
the 5' NTR and the first pestivirus nucleotide sequence is from a bovine viral diarrhea virus
(BVDV).

18. The genetically-engineered virus of claim 16, wherein the BVDV nucleotide
sequence is located at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU and
the first HCV nucleotide sequence in the chimeric 5' NTR comprises an internal ribosome
entry site (IRES).

19. A vaccine against bovine viral diarrhea virus (BVDV) comprising an
immunogenically-effective amount of a genetically-engineered virus comprising a chimeric
RNA genome having:

(a) a 5' nontranslated region (5' NTR); 

(b) an open reading frame (ORF) region; and 

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from BVDV in operable linkage with a first nucleotide sequence from an hepatitis C virus
(HCV), and wherein the genetically-engineered virus is attenuated as compared to BVDV.

20. The vaccine of claim 19, wherein the chimeric region is the 5' NTR and the
BVDV nucleotide sequence is located at the 5' terminus of the chimeric 5' NTR and
comprises 5' RUAU and the first HCV nucleotide sequence in the chimeric 5' NTR
comprises an internal ribosome entry site (IRES).

21. A polynucleotide comprising a chimeric viral RNA which comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a heterologous nucleotide sequence and wherein
said chimeric viral RNA is replication-competent.

pACNR/BVD NADL-Xba* -> Graphic Map

DNA sequence 15065 b.p. gtatacgagaat ... cgactcactata circular

pACNR/BVD NADL-Xba = HaeII and XhoI digest of pACNR/BVD NADL ligated to
HaeII and XhoI digest of pACNR1180/DraIII-/BVD5'
8/27 corrected nt 12136 G to C to give HpaI site.

pACNR/BVD NADL-Xba* -> Genes

DNA sequence 15065 b.p. gtatacgagaat ... cgactcactata circular

pACNR/BVD NADL-Xba = HaeII and XhoI digest of pACNR/BVD NADL ligated to
HaeII and XhoI digest of pACNR1180/DraIII-/BVD5'
8/27 corrected nt 12136 G to C to give HpaI site.



321 cagcctgatagggtgctgcagaggcccactgtattgctactaaaaatctctgctgtacatggcac ATG GAG TTG 394
1 MEL 3

395 ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT 454
4 ITNELLYKTYKQKPVGVEEP 23

455 GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT GGT GAA AGG GGA GCA GTC CAC CCT CAA TCG 514
24 VYDQAGDPLFGERGAVHPQS 43

515 ACG CTA AAG CTC CCA CAC AAG AGA GGG GAA CGC GAT GTT CCA ACC AAC TTG GCA TCC TTA 574
44 TLKLPHKRGERDVPTNLASL 63

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TAC CTG 634
64 PKRGDCRSGNSRGPVSGIYL 83

635 AAG CCA GGG CCA CTA TTT TAC CAG GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CCG CTG 694
84 KPGPLFYQDYKGPVYHRAPL 103

695 GAG CTC TTT GAG GAG GGA TCC ATG TGT GAA ACG ACT AAA CGG ATA GGG AGA GTA ACT GGA 754
104 ELFEEGSMCETTKRIGRVTG 123

755 AGT GAC GGA AAG CTG TAC CAC ATT TAT GTG TGT ATA GAT GGA TGT ATA ATA ATA AAA AGT 814
124 SDGKLYHIYVCIDGCIIIKS 143

815 GCC ACG AGA AGT TAC CAA AGG GTG TTC AGG TGG GTC CAT AAT AGG CTT GAC TGC CCT CTA 874
144 ATRSYQRVFRWVHNRLDCPL 163

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAG AAA 934
164 WVTTCSDTKEEGATKKKTQK 183

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GAC AGC 994
184 PDRLERGKMKIVPKESEKDS 203

995 AAA ACT AAA CCT CCG GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AGG AAG 1054
204 KTKPPDATIVVEGVKYQVRK 223

1055 AAG GGA AAA ACC AAG AGT AAA AAC ACT CAG GAC GGC TTG TAC CAT AAC AAA AAC AAA CCT 1114
224 KGKTKSKNTQDGLYHNKNKP 243

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCG TGG GCA ATA ATA GCT ATA GTT 1174
244 QESRKKLEKALLAWAIIAIV 263

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TGG AAC CTA CAA GAT AAT GGG ACG 1234
264 LFQVTMGENITQWNLQDNGT 283

1235 GAA GGG ATA CAA CGG GCA ATG TTC CAA AGG GGT GTG AAT AGA AGT TTA CAT GGA ATC TGG 1294
284 EGIQRAMFQRGVNRSLHGIW 303

1295 CCA GAG AAA ATC TGT ACT GGT GTC CCT TCC CAT CTA GCC ACC GAT ATA GAA CTA AAA ACA 1354
304 PEKICTGVPSHLATDIELKT 323

1355 ATT CAT GGT ATG ATG GAT GCA AGT GAG AAG ACC AAC TAC ACG TGT TGC AGA CTT CAA CGC 1414
324 IHGMMDASEKTNYTCCRLQR 343

1415 CAT GAG TGG AAC AAG CAT GGT TGG TGC AAC TGG TAC AAT ATT GAA CCC TGG ATT CTA GTC 1474
344 HEWNKHGWCNWYNIEPWILV 363

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GTC ACT 1534
364 MNRTQANLTEGQPPRECAVT 383

1535 TGT AGG TAT GAT AGG GCT AGT GAC TTA AAC GTG GTA ACA CAA GCT AGA GAT AGC CCC ACA 1594
384 CRYDRASDLNVVTQARDSPT 403

1595 CCC TTA ACA GGT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATG CGG GGC 1654
404 PLTGCKKGKNFSFAGILMRG 423

1655 CCC TGC AAC TTT GAA ATA GCT GCA AGT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT AGT 1714
424 PCNFEIAASDVLFKEHERIS 443

1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA GGT GCC 1774 

444 MFQDTTLYLVDG LTNSLEGA 463 

1775 AGA CAA GGA ACC GCT AAA CTG ACA ACC TGG TTA GGC AAG CAG CTC GGG ATA CTA GGA AAA 1834 

464 RQGTAKLTTWLGKQLGI LGK 483 

1835 AAG TTG GAA AAC AAG AGT AAG ACG TGG TTT GGA GCA TAC GCT GCT TCC CCT TAC TGT GAT 1894 

484 KLENKSKTWFGAY AASPYC D 503 

1895 GTC GAT CGC AAA ATT GGC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TGC TTA CCC 1954 

504 VDRKIGYIWYTKNCTPACLP 523

1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT GAC ACC AAT GCA GAG GAC GGC AAG ATA 2014 

524 K NTK I VG PGK FDTNA EDG K I 543 

2015 TTA CAT GAG ATG GGG GGT CAC TTG TCG GAG GTA CTA CTA CTT TCT TTA GTG GTG CTG TCC 2074

544 LHEMGGHLSEVLLLS LVVLS 563 

2075 GAC TTC GCA CCG GAA ACA GCT AGT GTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564 DFAPETASVM YLI LHFSI PQ 583 

2135 AGT CAC GTT GAT GTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTG GAG CTG ACA 2194 

584 SHVDVMDCDKTQ LNLTVELT 603 

2195 ACA GCT GAA GTA ATA CCA GGG TCG GTC TGG AAT CTA GGC AAA TAT GTA TGT ATA AGA CCA 2254 

604 T A EV I PG SVW N LG KY V C I R P 623 

2255 AAT TGG TGG CCT TAT GAG ACA ACT GTA GTG TTG GCA TTT GAA GAG GTG AGC CAG GTG GTG 2314 

624 NWWPYETTVV LAFEEVSQVV 643 

2315 AAG TTA GTG TTG AGG GCA CTC AGA GAT TTA ACA CGC ATT TGG AAC GCT GCA ACA ACT ACT 2374 

644 KLVLRALRDLTRIWNAATTT 663

2375 GCT TTT TTA GTA TGC CTT GTT AAG ATA GTC AGG GGC CAG ATG GTA CAG GGC ATT CTG TGG 2434 

664 AFLVCLVKIVRGQMVQGI LW 683 

2435 CTA CTA TTG ATA ACA GGG GTA CAA GGG CAC TTG GAT TGC AAA CCT GAA TTC TCG TAT GCC 2494

684 LLLITGVQGH LDCKPEFSY A 703 

2495 ATA GCA AAG GAC GAA AGA ATT GGT CAA CTG GGG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 

704 IAKDERIGQLGAEGLT TTWK 723 

2555 GAA TAC TCA CCT GGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC GAA GAT GGG 2614 

724 EYSPGMKLEDTMVIAWCEDG 743 

2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMYLQRCTRETRYLAILHT 763

2675 AGA GCC TTG CCG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG CGA AAG CAA GAG GAT 2734 

764 RALPTSVVFKKLFDGRKQE D 783 

2735 GTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794 

784 VVEMNDNFEFGLC PCDAKP I 803 

2795 GTA AGA GGG AAG TTC AAT ACA ACG CTG CTG AAC GGA CCG GCC TTC CAG ATG GTA TGC CCC 2854 

804 VRGKFNTTLLNG PAFQMVC P 823 

2855 ATA GGA TGG ACA GGG ACT GTA AGC TGT ACG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914 

824 IGWTGTVSCTSFNMDTLATT 843 

2915 GTG GTA CGG ACA TAT AGA AGG TCT AAA CCA TTC CCT CAT AGG CAA GGC TGT ATC ACC CAA 2974 

844 VVRTYRRSKPFPHRQGCITQ 863

2975 AAG AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034

864 KNLGEDLHNC I LGGNWTCV P 883 

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TGT GGC TAT CAA 3094 

884 GDQLLYKGGS I ESCKWCGYQ 903 

3095 TTT AAA GAG AGT GAG GGA CTA CCA CAC TAC CCC ATT GGC AAG TGT AAA TTG GAG AAC GAG 3154 

904 FKESEGLPHY PIGKCKLENE 923 

3155 ACT GGT TAC AGG CTA GTA GAC AGT ACC TCT TGC AAT AGA GAA GGT GTG GCC ATA GTA CCA 3214 

924 TGYRLVDSTSCNREGVAI V P 943 

3215 CAA GGG ACA TTA AAG TGC AAG ATA GGA AAA ACA ACT GTA CAG GTC ATA GCT ATG GAT ACC 3274 

944 QGTLKCKIGKTTVQVIAMDT 963

3275 AAA CTC GGA CCT ATG CCT TGC AGA CCA TAT GAA ATC ATA TCA AGT GAG GGG CCT GTA GAA 3334 

964 KLGPMPCRPYEI ISSEGPVE 983 

3335 AAG ACA GCG TGT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394 

984 KTACTFNYTKTLKNKYFEPR 1003

3395 GAC AGC TAC TTT CAG CAA TAC ATG CTA AAA GGA GAG TAT CAA TAC TGG TTT GAC CTG GAG
1004 DSYFQQYMLKGEYQYWFDLE

3455 GTG ACT GAC CAT CAC CGG GAT TAC TTC GCT GAG TCC ATA TTA GTG GTG GTA GTA GCC CTC
1024 VTDHHRDYFAES I L V V V V A L 

3515 TTG GGT GGC AGA TAT GTA CTT TGG TTA CTG GTT ACA TAC ATG GTC TTA TCA GAA CAG AAG 
1044 LGGRYVLWLLVTYMVLSEQK 

3575 GCC TTA GGG ATT CAG TAT GGA TCA GGG GAA GTG GTG ATG ATG GGC AAC TTG CTA ACC CAT 
1064 ALGIQYGSGEVVMMGNLLTH 

3635 AAC AAT ATT GAA GTG GTG ACA TAC TTC TTG CTG CTG TAC CTA CTG CTG AGG GAG GAG AGC 
1084 NNIEVVTYFLLLY LLLREES 

3695 GTA AAG AAG TGG GTC TTA CTC TTA TAC CAC ATC TTA GTG GTA CAC CCA ATC AAA TCT GTA 
1104 VKKWVLLLYH ILVVHPIKSV 

3755 ATT GTG ATC CTA CTG ATG ATT GGG GAT GTG GTA AAG GCC GAT TCA GGG GGC CAA GAG TAC 
1124 IVILLMIGDVVKADSGGQEY 

3815 TTG GGG AAA ATA GAC CTC TGT TTT ACA ACA GTA GTA CTA ATC GTC ATA GGT TTA ATC ATA 
1144 LGKIDLCFTTVVLIVIGLI I 

3875 GCC AGG CGT GAC CCA ACT ATA GTG CCA CTG GTA ACA ATA ATG GCA GCA CTG AGG GTC ACT 
1164 ARRDPTIVPLVTI MAALRVT 

3935 GAA CTG ACC CAC CAG CCT GGA GTT GAC ATC GCT GTG GCG GTC ATG ACT ATA ACC CTA CTG 
1184 ELTHQPGVDIAVAVMT ITLL 

3995 ATG GTT AGC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA TGG TTA CAG TGC ATT CTC AGC 
1204 MVSYVTDYFRYKKWLQCI L S 

4055 CTG GTA TCT GCG GTG TTC TTG ATA AGA AGC CTA ATA TAC CTA GGT AGA ATC GAG ATG CCA 
1224 LVSAVFLIRSLIYLGRIEMP 

4115 GAG GTA ACT ATC CCA AAC TGG AGA CCA CTA ACT TTA ATA CTA TTA TAT TTG ATC TCA ACA 
1244 EVTI PNWRPLTLI L L Y L I ST 

4175 ACA ATT GTA ACG AGG TGG AAG GTT GAC GTG GCT GGC CTA TTG TTG CAA TGT GTG CCT ATC 
1264 TIVTRWKVDVAGLLLQCVP I 

4235 TTA TTG CTG GTC ACA ACC TTG TGG GCC GAC TTC TTA ACC CTA ATA CTG ATC CTG CCT ACC 
1284 LLLVTTLWADFLTLILILPT 

4295 TAT GAA TTG GTT AAA TTA TAC TAT CTG AAA ACT GTT AGG ACT GAT ATA GAA AGA AGT TGG
1304 YELVKLYYLKTVRTDIERSW 

4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT GAT GAG AGT GGA GAG 
1324 LGGIDYTRVDS IYDVDESGE 

4415 GGC GTA TAT CTT TTT CCA TCA AGG CAG AAA GCA CAG GGG AAT TTT TCT ATA CTC TTG CCC 
1344 GVYLFPSRQKAQGNFSI LLP 

4475 CTT ATC AAA GCA ACA CTG ATA AGT TGC GTC AGC AGT AAA TGG CAG CTA ATA TAC ATG AGT 
1364 LIKATLISCVSSKWQLIYMS 

4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AGG AAA GTT ATA GAA GAG ATC TCA GGA 
1384 YLTLDFMYYMHRKVIEEISG 

4595 GGT ACC AAC ATA ATA TCC AGG TTA GTG GCA GCA CTC ATA GAG CTG AAC TGG TCC ATG GAA 
1404 GTNI ISRLVAALI ELNWSME 

4655 GAA GAG GAG AGC AAA GGC TTA AAG AAG TTT TAT CTA TTG TCT GGA AGG TTG AGA AAC CTA 
1424 EEESKGLKKFYLLSGRLRNL 

4715 ATA ATA AAA CAT AAG GTA AGG AAT GAG ACC GTG GCT TCT TGG TAC GGG GAG GAG GAA GTC 
1444 IIKHKVRNETVASWYGEEEV 

4775 TAC GGT ATG CCA AAG ATC ATG ACT ATA ATC AAG GCC AGT ACA CTG AGT AAG AGC AGG CAC 
1464 YGMPK IMTI I KASTLSKSRH 

4835 TGC ATA ATA TGC ACT GTA TGT GAG GGC CGA GAG TGG AAA GGT GGC ACC TGC CCA AAA TGT 
1484 CIICTVCEGREWKGGTCPKC 

4895 GGA CGC CAT GGG AAC CCG ATA ACG TGT GGG ATG TCG CTA GCA GAT TTT GAA GAA AGA CAC 
1504 GRHGKPI TCGMSLADFEERH 

4955 TAT AAA AGA ATC TTT ATA AGG GAA GGC AAC TTT GAG GGT ATG TGC AGC CGA TGC CAG GGA 
1524 YKRIFIREGNFEGMCSRCQG 

5015 AAG CAT AGG AGG TTT GAA ATG GAC CGG GAA CCT AAG AGT GCC AGA TAC TGT GCT GAG TGT
1544 KHRRFEMDRE PKSARYCAEC 

5075 AAT AGG CTG CAT CCT GCT GAG GAA GGT GAC TTT TGG GCA GAG TCG AGC ATG TTG GGC CTC
1564 NRLHPAEEGDFWAESSMLGL

5135 AAA ATC ACC TAC TTT GCG CTG ATG GAT GGA AAG GTG TAT GAT ATC ACA GAG TGG GCT GGA 5194

1584 KITYFALMDGKVYDITEWAG 1603

5195 TGC CAG CGT GTG GGA ATC TCC CCA GAT ACC CAC AGA GTC CCT TGT CAC ATC TCA TTT GGT 5254 

1604 CQRVGISPDTHRVPCHISFG 1623

5255 TCA CGG ATG CCT TTC AGG CAG GAA TAC AAT GGC TTT GTA CAA TAT ACC GCT AGG GGG CAA 5314 

1624 SRMPFRQEYNGFVQYTARGQ 1643

5315 CTA TTT CTG AGA AAC TTG CCC GTA CTG GCA ACT AAA GTA AAA ATG CTC ATG GTA GGC AAC 5374 

1644 LFLRNLPVLATKVKMLMVGN 1663 

5375 CTT GGA GAA GAA ATT GGT AAT CTG GAA CAT CTT GGG TGG ATC CTA AGG GGG CCT GCC GTG 5434 

1664 LGEEIGNLEHLGWI LRGPAV 1683 

5435 TGT AAG AAG ATC ACA GAG CAC GAA AAA TGC CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA 5494 

1684 CKKITEHEKCHINI LDKLTA 1703 

5495 TTT TTC GGG ATC ATG CCA AGG GGG ACT ACA CCC AGA GCC CCG GTG AGG TTC CCT ACG AGC 5554 

1704 FFGIMPRGTTPRAPVRFPTS 1723 

5555 TTA CTA AAA GTG AGG AGG GGT CTG GAG ACT GCC TGG GCT TAC ACA CAC CAA GGC GGG ATA 5614 

1724 LLKVRRGLETAWAYTHQGGI 1743 

5615 AGT TCA GTC GAC CAT GTA ACC GCC GGA AAA GAT CTA CTG GTC TGT GAC AGC ATG GGA CGA 5674 

1744 SSVDHVTAGKDLLVCDSMGR 1763 

5675 ACT AGA GTG GTT TGC CAA AGC AAC AAC AGG TTG ACC GAT GAG ACA GAG TAT GGC GTC AAG 5734 

1764 TRVVCQSNNRLTDETEY GVK 1783 

5735 ACT GAC TCA GGG TGC CCA GAC GGT GCC AGA TGT TAT GTG TTA AAT CCA GAG GCC GTT AAC 5794 

1784 TDSGCPDGARCYVLNPEAVN 1803 

5795 ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC CTC CAA AAG ACA GGT GGA GAA TTC ACG TGT 5854 

1804 ISGSKGAVVHLQKTGGEFTC 1823

5855 GTC ACC GCA TCA GGC ACA CCG GCT TTC TTC GAC CTA AAA AAC TTG AAA GGA TGG TCA GGC 5914

1824 VTASGTPAFFDLKNLKGWSG 1843

5915 TTG CCT ATA TTT GAA GCC TCC AGC GGG AGG GTG GTT GGC AGA GTC AAA GTA GGG AAG AAT 5974

1844 LPIFEASSGRVVGRVKVGKN 1863

5975 GAA GAG TCT AAA CCT ACA AAA ATA ATG AGT GGA ATC CAG ACC GTC TCA AAA AAC AGA GCA 6034

1864 EESKPTKIMSGIQTVSKNRA 1883

6035 GAC CTG ACC GAG ATG GTC AAG AAG ATA ACC AGC ATG AAC AGG GGA GAC TTC AAG CAG ATT 6094

1884 DLTEMVKKITSMNRGDFKQI 1903 

6095 ACT TTG GCA ACA GGG GCA GGC AAA ACC ACA GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA 6154 

1904 TLATGAGKTTELPKAVI EEI 1923 

6155 GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA CCA TTA AGG GCA GCG GCA GAG TCA GTC TAC 6214 

1924 GRHKRVLVLIPLRAAAESVY 1943

6215 CAG TAT ATG AGA TTG AAA CAC CCA AGC ATC TCT TTT AAC CTA AGG ATA GGG GAC ATG AAA 6274 

1944 QYMRLKHPSISFNLRIGDMK 1963

6275 GAG GGG GAC ATG GCA ACC GGG ATA ACC TAT GCA TCA TAC GGG TAC TTC TGC CAA ATG CCT 6334 

1964 EGDMATGITYASYGYFCQMP 1983 

6335 CAA CCA AAG CTC AGA GCT GCT ATG GTA GAA TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT 6394 

1984 QPK LRAAMVEY S Y I F L D E Y H 2003 

6395 TGT GCC ACT CCT GAA CAA CTG GCA ATT ATC GGG AAG ATC CAC AGA TTT TCA GAG AGT ATA 6454 

2004 CATPEQLAIIGKIHRFS ESI 2023 

6455 AGG GTT GTC GCC ATG ACT GCC ACG CCA GCA GGG TCG GTG ACC ACA ACA GGT CAA AAG CAC 6514 

2024 RVVAMTATPAGS VTTTGQKH 2043 

6515 CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA ATG AAA GGG GAG GAT CTT GGT AGT CAG TTC 6574 

2044 PI EEFIAPEVMKGEDLGSQF 2063 

6575 CTT GAT ATA GCA GGG TTA AAA ATA CCA GTG GAT GAG ATG AAA GGC AAT ATG TTG GTT TTT 6634 

2064 LDIAGLKIPVDEMKGNMLVF 2083 

6635 GTA CCA ACG AGA AAC ATG GCA GTA GAG GTA GCA AAG AAG CTA AAA GCT AAG GGC TAT AAC 6694 

2084 VPTRNMAVEVAKKLKAK GYN 2103 

6695 TCT GGA TAC TAT TAC AGT GGA GAG GAT CCA GCC AAT CTG AGA GTT GTG ACA TCA CAA TCC 6754 

2104 SGYYYSGEDPANLRVVTSQS 2123 

6755 CCC TAT GTA ATC GTG GCT ACA AAT GCT ATT GAA TCA GGA GTG ACA CTA CCA GAT TTG GAC 6814 

2124 PYVIVATNAIESGVTLPDLD 2143 

6815 ACG GTT ATA GAC ACG GGG TTG AAA TGT GAA AAG AGG GTG AGG GTA TCA TCA AAG ATA CCC 6874 

2144 TVIDTGLKCEKRVRVSSKIP 2163

6875 TTC ATC GTA ACA GGC CTT AAG AGG ATG GCC GTG ACT GTG GGT GAG CAG GCG CAG CGT AGG
2164 FIVTGLKRMAVTVGEQAQRR

6935 GGC AGA GTA GGT AGA GTG AAA CCC GGG AGG TAT TAT AGG AGC CAG GAA ACA GCA ACA GGG 
2184 GRVGRV K PGRYY RSQETATG 

6995 TCA AAG GAC TAC CAC TAT GAC CTC TTG CAG GCA CAA AGA TAC GGG ATT GAG GAT GGA ATC 
2204 SKDYHYDLLQAQRYGI EDGI 

7055 AAC GTG ACG AAA TCC TTT AGG GAG ATG AAT TAC GAT TGG AGC CTA TAC GAG GAG GAC AGC 
2224 NVTKSFREMNYDWSLYEEDS 

7115 CTA CTA ATA ACC CAG CTG GAA ATA CTA AAT AAT CTA CTC ATC TCA GAA GAC TTG CCA GCC 
2244 LLITQLEI LNNLLI SEDLPA 

7175 GCT GTT AAG AAC ATA ATG GCC AGG ACT GAT CAC CCA GAG CCA ATC CAA CTT GCA TAC AAC 
2264 AVKNIMARTDHPEPIQLAYN 

7235 AGC TAT GAA GTC CAG GTC CCG GTC CTG TTC CCA AAA ATA AGG AAT GGA GAA GTC ACA GAC 
2284 SYEVQVPVLFPK IRNGEVTD 

7295 ACC TAC GAA AAT TAC TCG TTT CTA AAT GCC AGA AAG TTA GGG GAG GAT GTG CCC GTG TAT 
2304 TYENYSFLNARKLGEDVPVY 

7355 ATC TAC GCT ACT GAA GAT GAG GAT CTG GCA GTT GAC CTC TTA GGG CTA GAC TGG CCT GAT 
2324 IYATEDEDLAVDLLGLDWPD 

7415 CCT GGG AAC CAG CAG GTA GTG GAG ACT GGT AAA GCA CTG AAG CAA GTG ACC GGG TTG TCC 
2344 PGNQQVVETGKALKQVTGLS 

7475 TCG GCT GAA AAT GCC CTA CTA GTG GCT TTA TTT GGG TAT GTG GGT TAC CAG GCT CTC TCA 
2364 SAENALLVALFGYVGYQALS 

7535 AAG AGG CAT GTC CCA ATG ATA ACA GAC ATA TAT ACC ATC GAG GAC CAG AGA CTA GAA GAC 
2384 KRHVPMITDIYTIEDQRLED

7595 ACC ACC CAC CTC CAG TAT GCA CCC AAC GCC ATA AAA ACC GAT GGG ACA GAG ACT GAA CTG 
2404 TTHLQYAPNAI KTDGTETEL 

7655 AAA GAA CTG GCG TCG GGT GAC GTG GAA AAA ATC ATG GGA GCC ATT TCA GAT TAT GCA GCT 
2424 KELASGDVEKIMG AI SDYAA 

7715 GGG GGA CTG GAG TTT GTT AAA TCC CAA GCA GAA AAG ATA AAA ACA GCT CCT TTG TTT AAA 
2444 GGLEFVKSQAEKIKTAPLFK

7775 GAA AAC GCA GAA GCC GCA AAA GGG TAT GTC CAA AAA TTC ATT GAC TCA TTA ATT GAA AAT 
2464 ENAEAAKGYVQKFI DS L I EN 

7835 AAA GAA GAA ATA ATC AGA TAT GGT TTG TGG GGA ACA CAC ACA GCA CTA TAC AAA AGC ATA 
2484 KEEI IRYGLWGTHTALYKS I 

7895 GCT GCA AGA CTG GGG CAT GAA ACA GCG TTT GCC ACA CTA GTG TTA AAG TGG CTA GCT TTT 
2504 AARLGHETAFATLVLKWLAF 

7955 GGA GGG GAA TCA GTG TCA GAC CAC GTC AAG CAG GCG GCA GTT GAT TTA GTG GTC TAT TAT 
2524 GGESVSDHVKQAAVDLVVYY 

8015 GTG ATG AAT AAG CCT TCC TTC CCA GGT GAC TCC GAG ACA CAG CAA GAA GGG AGG CGA TTC 
2544 VMNKPSFPGDSETQQEGRRF 

8075 GTC GCA AGC CTG TTC ATC TCC GCA CTG GCA ACC TAC ACA TAC AAA ACT TGG AAT TAC CAC 
2564 VASLFISALATYTYKTWNYH

8135 AAT CTC TCT AAA GTG GTG GAA CCA GCC CTG GCT TAC CTC CCC TAT GCT ACC AGC GCA TTA 
2584 NLSKVVEPALAYLPYATSAL 

8195 AAA ATG TTC ACC CCA ACG CGG CTG GAG AGC GTG GTG ATA CTG AGC ACC ACG ATA TAT AAA 
2604 KM FT PTR L E S V V I L STT I Y K 

8255 ACA TAC CTC TCT ATA AGG AAG GGG AAG AGT GAT GGA TTG CTG GGT ACG GGG ATA AGT GCA
2624 TYLSIRKGKSDGLLGTGI SA 

8315 GCC ATG GAA ATC CTG TCA CAA AAC CCA GTA TCG GTA GGT ATA TCT GTG ATG TTG GGG GTA 
2644 AMEILSQNPVSVGISVMLGV 

8375 GGG GCA ATC GCT GCG CAC AAC GCT ATT GAG TCC AGT GAA CAG AAA AGG ACC CTA CTT ATG 
2664 GAIAAHNAIESSEQKRTLLM 

8435 AAG GTG TTT GTA AAG AAC TTC TTG GAT CAG GCT GCA ACA GAT GAG CTG GTA AAA GAA AAC 
2684 KVFVKNFLDQAATDELVKEN 

8495 CCA GAA AAA ATT ATA ATG GCC TTA TTT GAA GCA GTC CAG ACA ATT GGT AAC CCC CTG AGA 
2704 PEKIIMALFEAVQTIGNPLR 

8555 CTA ATA TAC CAC CTG TAT GGG GTT TAC TAC AAA GGT TGG GAG GCC AAG GAA CTA TCT GAG
2724 LIYHLYGVYYKGWEAKELSE

8615 AGG ACA GCA GGC AGA AAC TTA TTC ACA TTG ATA ATG TTT GAA GCC TTC GAG TTA TTA GGG 8674

2744 RTAGRNLFTLIMFEAFELLG 2763

8675 ATG GAC TCA CAA GGG AAA ATA AGG AAC CTG TCC GGA AAT TAC ATT TTG GAT TTG ATA TAC 8734

2764 MDSQGKIRNLSGNYILDLIY 2783



8735 GGC CTA CAC AAG CAA ATC AAC AGA GGG CTG AAG AAA ATG GTA CTG GGG TGG GCC CCT GCA 8794 

2784 GLHKQINRGLKKMVLGWAPA 2803 

8795 CCC TTT AGT TGT GAC TGG ACC CCT AGT GAC GAG AGG ATC AGA TTG CCA ACA GAC AAC TAT 8854 

2804 PFSCDWTPSDERIRL PTDNY 2823 

8855 TTG AGG GTA GAA ACC AGG TGC CCA TGT GGC TAT GAG ATG AAA GCT TTC AAA AAT GTA GGT 8914 

2824 LRVETRC PCGYEMKA FKNVG 2843 



8915 GGC AAA CTT ACC AAA GTG GAG GAG AGC GGG CCT TTC CTA TGT AGA AAC AGA CCT GGT AGG 8974

2844 GKLTKVEESGPFLCRNRPGR 2863



8975 GGA CCA GTC AAC TAC AGA GTC ACC AAG TAT TAC GAT GAC AAC CTC AGA GAG ATA AAA CCA 9034 

2864 GPVNYRVTKYYDDNLREIKP 2883

9035 GTA GCA AAG TTG GAA GGA CAG GTA GAG CAC TAC TAC AAA GGG GTC ACA GCA AAA ATT GAC 9094 

2884 VAKLEGQVEHYYKGVTAK ID 2903 

9095 TAC AGT AAA GGA AAA ATG CTC TTG GCC ACT GAC AAG TGG GAG GTG GAA CAT GGT GTC ATA 9154 

2904 YSKGKMLLATDKWEVEHGVI 2923 

9155 ACC AGG TTA GCT AAG AGA TAT ACT GGG GTC GGG TTC AAT GGT GCA TAC TTA GGT GAC GAG 9214 

2924 TRLAKRYTGVGFNGAYLGDE 2943 

9215 CCC AAT CAC CGT GCT CTA GTG GAG AGG GAC TGT GCA ACT ATA ACC AAA AAC ACA GTA CAG 9274 

2944 PNHRALVERDCATIT KNTVQ 2963 

9275 TTT CTA AAA ATG AAG AAG GGG TGT GCG TTC ACC TAT GAC CTG ACC ATC TCC AAT CTG ACC 9334 

2964 FLKMKKGCAFTY DLT I SNLT 2983 



9335 AGG CTC ATC GAA CTA GTA CAC AGG AAC AAT CTT GAA GAG AAG GAA ATA CCC ACC GCT ACG 9394 
2984 RLI ELVH RNNLEEKE IPTAT 3003 






9395 GTC ACC ACA TGG CTA GCT TAC ACC TTC GTG AAT GAA GAC GTA GGG ACT ATA AAA CCA GTA 9454 

3004 VTTWLAYTFVNEDVGTI KPV 3023 

9455 CTA GGA GAG AGA GTA ATC CCC GAC CCT GTA GTT GAT ATC AAT TTA CAA CCA GAG GTG CAA 9514 

3024 LGERVIPDPVVDINL QPEVQ 3043 

9515 GTG GAC ACG TCA GAG GTT GGG ATC ACA ATA ATT GGA AGG GAA ACC CTG ATG ACA ACG GGA 9574 

3044 VDTSEVG ITI IGRET LMTTG 3063 

9575 GTG ACA CCT GTC TTG GAA AAA GTA GAG CCT GAC GCC AGC GAC AAC CAA AAC TCG GTG AAG 9634 

3064 VTPVLEKVEPDASDNQNSVK 3083 






9635 ATC GGG TTG GAT GAG GGT AAT TAC CCA GGG CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA 9694 

3084 IGLDEGNYPGPGIQT HTLTE 3103 

9695 GAA ATA CAC AAC AGG GAT GCG AGG CCC TTC ATC ATG ATC CTG GGC TCA AGG AAT TCC ATA 9754 

3104 EIHNRDARPFIMILGSRNSI 3123 

9755 TCA AAT AGG GCA AAG ACT GCT AGA AAT ATA AAT CTG TAC ACA GGA AAT GAC CCC AGG GAA 9814 

3124 SNRAKTARNINLYTGNDPRE 3143 

9815 ATA CGA GAC TTG ATG GCT GCA GGG CGC ATG TTA GTA GTA GCA CTG AGG GAT GTC GAC CCT 9874 

3144 IRDLMAAGRMLVVALRDVDP 3163 



9875 GAG CTG TCT GAA ATG GTC GAT TTC AAG GGG ACT TTT TTA GAT AGG GAG GCC CTG GAG GCT 9934

3164 ELSEMVDFKGTFLDREALEA 3183



9935 CTA AGT CTC GGG CAA CCT AAA CCG AAG CAG GTT ACC AAG GAA GCT GTT AGG AAT TTG ATA 9994 

3184 LSLGQPKPKQVT KEAVRNLI 3203 

9995 GAA CAG AAA AAA GAT GTG GAG ATC CCT AAC TGG TTT GCA TCA GAT GAC CCA GTA TTT CTG 10054 

3204 EQKKDVEI PNWFASDDPVFL 3223 



10055 GAA GTG GCC TTA AAA AAT GAT AAG TAC TAC TTA GTA GGA GAT GTT GGA GAG CTA AAA GAT 10114

3224 EVALKNDKYYLVGDVGELKD 3243

10115 CAA GCT AAA GCA CTT GGG GCC ACG GAT CAG ACA AGA ATT ATA AAG GAG GTA GGC TCA AGG 10174

3244 QAKALGATDQTRIIKEVGSR 3263



10175 ACG TAT GCC ATG AAG CTA TCT AGC TGG TTC CTC AAG GCA TCA AAC AAA CAG ATG AGT TTA 10234 

3264 TYAMKLSSWFLKASNKQMSL 3283 

10235 ACT CCA CTG TTT GAG GAA TTG TTG CTA CGG TGC CCA CCT GCA ACT AAG AGC AAT AAG GGG 10294

3284 TPLFEELLLRC PPATKSNKG 3303 



10295 CAC ATG GCA TCA GCT TAC CAA TTG GCA CAG GGT AAC TGG GAG CCC CTC GGT TGC GGG GTG 10354

3304 HMASAYQLAQGNWEPLGCGV 3323

10355 CAC CTA GGT ACA ATA CCA GCC AGA AGG GTG AAG ATA CAC CCA TAT GAA GCT TAC CTG AAG
3324 HLGTIPARRVKIHPYEAYLK

10415 TTG AAA GAT TTC ATA GAA GAA GAA GAG AAG AAA CCT AGG GTT AAG GAT ACA GTA ATA AGA 
3344 LKDF I E EEEKK PRVKDTV I R 

10475 GAG CAC AAC AAA TGG ATA CTT AAA AAA ATA AGG TTT CAA GGA AAC CTC AAC ACC AAG AAA 
3364 EHNKWI LKK I RFQGNLNTKK 

10535 ATG CTC AAC CCC GGG AAA CTA TCT GAA CAG TTG GAC AGG GAG GGG CGC AAG AGG AAC ATC 
3384 MLNPGKLSEQ LDREGRKRNI 

10595 TAC AAC CAC CAG ATT GGT ACT ATA ATG TCA AGT GCA GGC ATA AGG CTG GAG AAA TTG CCA 
3404 YNHQIGTIMSSAGI RLEKLP 

10655 ATA GTG AGG GCC CAA ACC GAC ACC AAA ACC TTT CAT GAG GCA ATA AGA GAT AAG ATA GAC
3424 IVRAQTDTKTFHEAIRDKID

10715 AAG AGT GAA AAC CGG CAA AAT CCA GAA TTG CAC AAC AAA TTG TTG GAG ATT TTC CAC ACG
3444 KSENRQNPELHNKLLEIFHT

10775 ATA GCC CAA CCC ACC CTG AAA CAC ACC TAC GGT GAG GTG ACG TGG GAG CAA CTT GAG GCG
3464 IAQPTLKHTYGEVTWEQLEA

10835 GGG ATA AAT AGA AAG GGG GCA GCA GGC TTC CTG GAG AAG AAG AAC ATC GGA GAA GTA TTG
3484 GINRKGAAGFLEKKNIGEVL

10895 GAT TCA GAA AAG CAC CTG GTA GAA CAA TTG GTC AGG GAT CTG AAG GCC GGG AGA AAG ATA
3504 DSEKHLVEQLVRDLKAGRKI

10955 AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT GAG AAG AGA GAT GTC AGT GAT GAC TGG CAG
3524 KYYETAI PKNEKRDVSDDWQ 

11015 GCA GGG GAC CTG GTG GTT GAG AAG AGG CCA AGA GTT ATC CAA TAC CCT GAA GCC AAG ACA 
3544 AGDLVVEKR P R V I QYP E A KT 

11075 AGG CTA GCC ATC ACT AAG GTC ATG TAT AAC TGG GTG AAA CAG CAG CCC GTT GTG ATT CCA 
3564 RLAITKVMYNWVKQQPVVIP 

11135 GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC ATC TTT GAT AAA GTG AGA AAG GAA TGG GAC 
3584 GYEGKTPLFNI FDKVRKEWD 

11195 TCG TTC AAT GAG CCA GTG GCC GTA AGT TTT GAC ACC AAA GCC TGG GAC ACT CAA GTG ACT 
3604 SFNEPVAVSFDTKAWDTQVT 

11255 AGT AAG GAT CTG CAA CTT ATT GGA GAA ATC CAG AAA TAT TAC TAT AAG AAG GAG TGG CAC 
3624 SKDLQLIGEI QKYYYKKEWH 

11315 AAG TTC ATT GAC ACC ATC ACC GAC CAC ATG ACA GAA GTA CCA GTT ATA ACA GCA GAT GGT 
3644 KFIDTITDHMTEVPVITADG 

11375 GAA GTA TAT ATA AGA AAT GGG CAG AGA GGG AGC GGC CAG CCA GAC ACA AGT GCT GGC AAC 
3664 EVYIRNGQRGSGQPDTSAGN 

11435 AGC ATG TTA AAT GTC CTG ACA ATG ATG TAC GGC TTC TGC GAA AGC ACA GGG GTA CCG TAC 
3684 SMLNVLTMMYGFCESTGVPY 

11495 AAG AGT TTC AAC AGG GTG GCA AGG ATC CAC GTC TGT GGG GAT GAT GGC TTC TTA ATA ACT
3704 KSFNRVARI HVCGDDGFL IT 

11555 GAA AAA GGG TTA GGG CTG AAA TTT GCT AAC AAA GGG ATG CAG ATT CTT CAT GAA GCA GGC 
3724 EKGLGLKFANKGMQILHEAG 

11615 AAA CCT CAG AAG ATA ACG GAA GGG GAA AAG ATG AAA GTT GCC TAT AGA TTT GAG GAT ATA 
3744 KPQKITEGEKMKV AYRF EDI 

11675 GAG TTC TGT TCT CAT ACC CCA GTC CCT GTT AGG TGG TCC GAC AAC ACC AGT AGT CAC ATG 
3764 EFCSHTPVPVRWSDNTSSHM 

11735 GCC GGG AGA GAC ACC GCT GTG ATA CTA TCA AAG ATG GCA ACA AGA TTG GAT TCA AGT GGA
3784 AGRDTAVILSKMATRLDSSG



11795 GAG AGG GGT A 'C ACA GCA TAT GAA AAA GCG GTA GCC TTC AGT TTC TTG CTG ATG TAT TCC 
3804 ERG 7 A Y E K A V A F S F L L M Y S 



11855 TGG AAC CCG 
3824 W N P 



GTT AGG AGG ATT TGC CTG TTG GTC CTT TCG CAA CAG CCA GAG ACA GAC 
VRRICLLVLSQQPETD 



11915 CCA TCA AAA CAT GCC ACT TAT TAT TAC AAA GGT GAT CCA ATA GGG GCC TAT AAA GAT GTA 
3844 PSKHATYYYKGDPIGAYKDV

11975 ATA GGT CGG AAT CTA AGT GAA CTG AAG AGA ACA GGC TTT GAG AAA TTG GCA AAT CTA AAC 
3864 IGRNLSELKRTGFEKLANLN 

12035 CTA AGC CTG TCC ACG TTG GGG ATC TGG ACT AAG CAC ACA AGC AAA AGA ATA ATT CAG GAC 
3884 LSLSTLGIWTKHTSKRIIQD 



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12095 TGT GTT GCC ATT GGG AAA GAA GAG GGC AAC TGG CTA GTT AAC GCC GAC AGG CTG ATA TCC 12154 
3904 CVAIGKEEGNWLVNADR L I S 3923 

12155 AGC AAA ACT GGC CAC TTA TAC ATA CCT GAT AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT 12214 
3924 SKTGHLYI PDKGFTLQGKHY 3943 

12215 GAG CAA CTG CAG CTA AGA ACA GAG ACA AAC CCG GTC ATG GGG GTT GGG ACT GAG AGA TAC 12274 
3944 EQLQLRTETN PVMGVGTERY 3963 

12275 AAG TTA GGT CCC ATA GTC AAT CTG CTG CTG AGA AGG TTG AAA ATT CTG CTC ATG ACG GCC 12334 
3964 KLGPIVNLLLRRLKILLMTA 3983 

12335 GTC GGC GTC AGC AGC TGA gacaaaacgtatatatcgtaaataaattaatccatgtacacagtgtatataaatat 12408 
3984 V G V S S * 3989 

12409 agccgggaccgcccacctcaagaagacgacacgcccaacacgcacagccaaacagcagccaagactacctaccccaagac 12488 

12489 aacaccacatccaacgcacacagcaccccagccgcacgaggacacgcccgacgcctacagccggaccagggaagaccccc 12568 

12569 aacagccccccgcaggttaatcaaccagtgggaacacgcggggcacgccgcgctccagcacaccgacgacccaactccca 12648 

12649 cgttcgacagcccaccaccgccgagcaagacgccccccgttgaatatggcccacaacacccctcgcaccaccgtttacgc 12728 

12729 aagcagacagtcctactgcccatgacgacacatLtttatctcgcgcaatgcaacaccagagaccctgagacacgtggctt 12808 

12809 cgccgaacaaatcgaactttcgctgagctgaaggaccagatcacgcaccctcccgacaacgcagaccgtcccgcggcaaa 12888 

12889 gcaaaagttcaaaaccaccaactggtccacccacaacaaagctctcatcaaccgtggctccctcactttctggccggatg 12968 

12969 atggggcgatccaggcctggtacgagtcagcaacaccctctccacgaggcagaccccagcgccagcggagtgtacaccgg 13048 

13049 cccaccatgccggcactgacgagggtgccagcgaagcgctccatgtggcaggagaaaaaaggctgcaccggcgcgtcagc 13128 

13129 agaatatgtgatacaggatacattccgccccctcgcccaccgactcgctacgcccggtcgctcgaccgcggcgagcggaa 13208 

13209 a cggc c c acgaac ggggcggaga c c ccc cggaaga tgccaggaaga t ac 1 1 aacagggaagcgagagggccgcggcaaag 13288 

13289 ccgtttctccataggctccgcccccctgacaagcatcacgaaacccgacgctcaaatcagtggcggcgaaacccgacagg 13368 

13369 accacaaagataccaggcgctccccccggcggcccccccgtgcgctctcccgcccctgcctctcggtctaccggtgccat 13448

13449 cccgccgtcacggccgcgcttgcctcattccacgcccgacactcagtcccgggcaggcagtccgccccaagctggaccgt 13528 

13529 acgcacgaaccccccgctcagtccgaccgctgcgccctatccggcaaccatcgtctcgagcccaacccggaaagacatgc 13608 

13609 aaaagcaccaccggcagcagccaccggtaaccgacctagaggagccagccttgaagtcatgcgccggccaaggctaaacc 13688 

13689 gaaaggacaagttttggcgactgcgcccctccaagccagccaccccggttcaaagagctggtagcccagagaacctccga 13768 

13769 aaaaccgccccgcaaggcggccctttcgctcccagagcaagagaccacgcgcagaccaaaacgacctcaagaagatcatc 13848 

13849 ctactaaggggtctgacgcccagtggaacgaaaacccacgttaagggacctcggccacgagaccatcaaaaaggatctcc 13928 

13929 acccagacccctttaaattaaaaacgaagcttcaaatcaatccaaagtatatatgagcaaactcggtctgacagt caeca 14008 

14009 acgcctaatcagcgaggcacctatctcagcgacctgtctacttcgttcacccacagctgcccgactccccgccgcgcaga 14088 

14089 taaccacgacacgggagggcttaccatccggccccagtgctgcaacgataccgcgagacccacgctcaccggctccagac 14168 

14169 tcatcagcaataaaccagccagccggaagggccgagcgcagaagtggccctgcaaccctatccgcccccatccagtctat 14248 

14249 taatcgtcgccgggaagctagagcaagcagctcgccagtcaatagcctgcgcaacgctgccgccaccgccgcaggcatcg 14328 

14329 tggtgtcacgctcgccgtccggcacggccccactcagctccggcccccaacgatcaaggcgagttacatgatcccccacg 14408 

14409 ccgcgcaaaaaagcggtcagccccctcggccctccgatcgttgccagaagtaagttggccgcagcgccaccactcacggt 14488 

14489 cacggcagcactgcacaacccccctaccgccatgccatccgtaagacgctcttctgcgactggtgagcacccaaccaagt 14568 

14569 caccccgagaacagcgcacgcggcgaccgagccgctcccgcccggcgccaacacgggacaacaccgcgccacatagcaga 14648 

14649 accccaaaagcgcccaccaccggaaaacgccccccggggcgaaaaccctcaaggaccctaccgctgccgagatccagccc 14728 

14729 gacgtaacccactcgcgcacccaaccgatccccagcacccctcacttccaccagcgtccccgggcgagcaaaaacaggaa 14808 

14809 ggcaaaacgccgcaaaaaagggaataagggcgacacggaaacgttgaacactcacacccccccttctccaacattattga 14888 

14889 agcacccaccagggctatcgtctcacgagcggacacacatccgaacgcacccagaaaaacaaacaaacaggggctccgcg 14968 

14969 cacacccccccgaaaagcgccacccgacgccgacccgaggcaattataacccgggccccacatatggatccaactctaga 15048

15049 taacacgactcaccaca 15065 
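
Each coding row of the listing above pairs 20 codons with their one-letter amino acid translation. As a minimal sketch (not part of the original figure), a row can be checked against its translation with the standard genetic code. The function name `translate` and the compact table layout are illustrative assumptions; codons containing characters other than A, C, G and T are reported as `?` rather than guessed.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering:
# the third codon position varies fastest, then the second, then the first.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(row: str) -> str:
    """Translate a row of DNA codons (spaces allowed) into one-letter amino acids."""
    sequence = "".join(row.split()).upper()
    if len(sequence) % 3 != 0:
        raise ValueError("row length is not a multiple of three")
    return "".join(
        CODON_TABLE.get(sequence[i:i + 3], "?")  # "?" marks an unreadable codon
        for i in range(0, len(sequence), 3)
    )


# First ten codons of the open reading frame, starting at the ATG in the row at nt 321.
print(translate("ATG GAG TTG ATC ACA AAT GAA CTT TTA TAC"))  # MELITNELLY
# Last codons of the open reading frame, from the row starting at nt 12335.
print(translate("GTC GGC GTC AGC AGC TGA"))  # VGVSS*
```

A mismatch between a row and its printed translation, or a `?` in the output, points to a character that was misread when the figure was scanned.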



WO 99/55366 



PCT/US99/08850 



22/67 

BVDV NADL (inf. clone) -> Genes 

DNA sequence 12578 b.p. gtatacgagaat ... ctaacagccccc linear

1 gcatacgagaattagaaaaggcacccgcatacgcattgggcaactaaaaacaataaccaggcccagggaacaaatccctc 80 

81 tcagcgaaggccgaaaagaggctagccacgcccccagtaggaccagcacaacgaggggggtagcaacagcggtgagcccg 160

161 tcggacggcccaagccccgagcacagggtagtcgccagcggcccgacgccccggaacaaaggtcccgagatgccacgtgg 240 

241 acgagggcacgcccaaagcacatcccaacccgagcgggggccgcccaggcaaaagcagcttcaaccgaccgccacgaaca 320 

321 cagcccgacagggcgccgcagaggcccactgcaccgccaccaaaaatctccgccgcacatggcac ATG GAG TTG 394 
^ M E L 3 

395 ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT 454 
41 TNELLY KTYKQK PVGVEE P 23 

455 GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT GGT GAA AGG GGA GCA GTC CAC CCT CAA TCG 514 
24VYDQAGDP LFGERGAVHPQS 43 

515 ACG CTA AAG CTC CCA CAC AAG AGA GGG GAA CGC GAT GTT CCA ACC AAC TTG GCA TCC. TTA 574 
44TLKLPHKRGERDVPTNLASL 63 

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TAC CTG 634 
64PKRGDCRSGNSRGPVSGIYL 83 

635 AAG CCA GGG CCA CTA TTT TAC CAG GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CCG CTG 694 
84KPGPLFYQDYKGPVYHRAPL 103 

695 GAG CTC TTT GAG GAG GGA TCC ATG TGT GAA ACG ACT AAA CGG ATA GGG AGA GTA ACT GGA 754 
104 ELFEEGSMCBTTKRIGRVTG 123 

755 ACT GAC GGA AAG CTG TAC CAC ATT TAT GTG TGT ATA GAT GGA TGT ATA ATA ATA AAA AGT 814 
124SDGKLYHIYVCIDGCIIIK S 143 

815 GCC ACG AGA AGT TAC CAA AGG GTG TTC AGG TGG GTC CAT AAT AGG CTT GAC TGC CCT CTA 874
144 ATRSYQRVFRWVHNRLDCP L 163 

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAG AAA 934
164 WVTTCSDTKEEGATKKKTQK 183

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GAC AGC 994
184 PDRLERGKMKIVPKESEKDS 203

995 AAA ACT AAA CCT CCG GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AGG AAG 1054
204 KTKPPDATIVVEGVKYQVRK 223

1055 AAG GGA AAA ACC AAG AGT AAA AAC ACT CAG GAC GGC TTG TAC CAT AAC AAA AAC AAA CCT 1114
224 KGKTKSKNTQDGLYHNKNKP 243 

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCG TGG GCA ATA ATA GCT ATA GTT 1174 
244 QES RKKLEKALLAWAIIAIV 263 

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TGG AAC CTA CAA GAT AAT GGG ACG 1234 
264 LFQVTMGENITQWN LQDNGT 283 

1235 GAA GGG ATA CAA CGG GCA ATG TTC CAA AGG GGT GTG AAT AGA AGT TTA CAT GGA ATC TCG 1294 
284 EGIQRAMFQRGVNRSLHGIW 303

1295 CCA GAG AAA ATC TGT ACT GGT GTC CCT TCC CAT CTA GCC ACC GAT ATA GAA CTA AAA ACA 1354 
304 PEKICTGVPSHLATDIELKT 323 

1355 ATT CAT GGT ATG ATG GAT GCA AGT GAG AAG ACC AAC TAC ACG TGT TGC AGA CTT CAA CGC 1414 
324 IHGMMDASEKTN YTCCRLQR 343 

1415 CAT GAG TGG AAC AAG CAT GGT TGG TGC AAC TGG TAC AAT ATT GAA CCC TGG ATT CTA GTC 1474 
344 HEWNKHGWCNWYNI EPWILV 363 

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GTC ACT 1534 
364 MNRTQANLTEGQPPRECAVT 383 

1535 TGT AGG TAT GAT AGG GCT AGT GAC TTA AAC GTG GTA ACA CAA GCT AGA GAT AGC CCC ACA 1594 
384 CRYDRASDLNVVTQARDSPT 403 

1595 CCC TTA ACA GGT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATG CGG GGC 1654 
404 PLTGCKKGKNFSFAGILMRG 423 

1655 CCC TGC AAC TTT GAA ATA GCT GCA AGT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT AGT 1714 
424 PCNFEIAASDVLFKEHERIS 443

1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA GGT GCC 1774 
444 MFQDTTLYLVDGLTNSLEGA 463 
1775 AGA CAA GGA ACC GCT AAA CTG ACA ACC TGG TTA GGC AAG CAG CTC GGG ATA CTA GGA AAA 1834

464 RQGTAKLTTWLGKQLGILGK 483

1835 AAG TTG GAA AAC AAG AGT AAG ACG TGG TTT GGA GCA TAC GCT GCT TCC CCT TAC TGT GAT 1894

484 KLENKSKTWFGAYAASPYCD 503

1895 GTC GAT CGC AAA ATT GGC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TGC TTA CCC 1954 

504 VDRKIGYIWYTKNCT PACLP 523 

1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT GAC ACC AAT CCA GAG GAC GGC AAG ATA 2014 

524 KNTKIVGPGKFDTNAEDGK I 543 

2015 TTA CAT GAG ATG GGG GGT CAC TTG TCG GAG GTA CTA CTA CTT TCT TTA GTC GTG CTG TCC 2074 

544 LHEMGGHLSEVLLLSLVVLS 563 

2075 GAC TTC GCA CCG GAA ACA GCT AGT GTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564 DFAPETASVMYLI LHFSI PQ 583 

2135 AGT CAC GTT GAT GTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTG GAG CTG ACA 2194 

584 SHVDVMDCDKTQLNLTVELT 603

2195 ACA GCT GAA GTA ATA CCA GGG TCG GTC TGG AAT CTA GGC AAA TAT GTA TGT ATA AGA CCA 2254 

604 TAEVI PGSVWNLGKYVC I R P 623 

2255 AAT TGG TGG CCT TAT GAG ACA ACT GTA GTG TTG GCA TTT GAA GAG GTG AGC CAG GTG GTG 2314 

624 NWWPYETTVVLAFEEVSQVV 643 

2315 AAG TTA GTG TTG AGG GCA CTC AGA GAT TTA ACA CGC ATT TGG AAC GCT GCA ACA ACT ACT 2374 

644 KLVLRALRDLTRIWNAATTT 663

2375 GCT TTT TTA GTA TGC CTT GTT AAG ATA GTC AGG GGC CAG ATG GTA CAG GGC ATT CTG TGG 2434 

664 AFLVCLVKIVRGQMVQGI LW 683 

2435 CTA CTA TTG ATA ACA GGG GTA CAA GGG CAC TTG GAT TGC AAA CCT GAA TTC TCG TAT GCC 2494 

684 LLLITGVQGHLD CKPEFSYA 703 

2495 ATA GCA AAG GAC GAA AGA ATT GGT CAA CTG GGG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 

704 IAKDERIGQLGAEGLTTTWK 723

2555 GAA TAC TCA CCT GGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC GAA GAT GGG 2614 

724 EYSPGMKLEDTMVIAWCEDG 743 

2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMYLQ RCTRETRYLAI LHT 763 

2675 AGA GCC TTG CCG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG CGA AAG CAA GAG GAT 2734 

764 RALPTSVVFKKLFDG RKQED 783 

2735 GTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794 

784 VVEMNDNFEFGLCPCDAKPI 803 

2795 GTA AGA GGG AAG TTC AAT ACA ACG CTG CTG AAC GGA CCG GCC TTC CAG ATG GTA TGC CCC 2854 

804 VRGKFNTTLLNGPAFQMVC P 823 

2855 ATA GGA TGG ACA GGG ACT GTA AGC TGT ACG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914 

824 IGWTGTVSCTSFNMDTLATT 843 

2915 GTG GTA CGG ACA TAT AGA AGG TCT AAA CCA TTC CCT CAT AGG CAA GGC TGT ATC ACC CAA 2974 

844 VVRTYRRSKPFPHRQGCITQ 863

2975 AAG AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034 

864 KNLGEDLHNCILGGNWTCVP 883

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TGT GGC TAT CAA 3094 

884 GDQLLYKGGSI ESCKWCGYQ 903 

3095 TTT AAA GAG AGT GAG GGA CTA CCA CAC TAC CCC ATT GGC AAG TGT AAA TTG GAG AAC GAG 3154 
904 FKESEGLPHYPI GKCKLENE 923 

3155 ACT GGT TAC AGG CTA GTA GAC AGT ACC TCT TGC AAT AGA GAA GGT GTG GCC ATA GTA CCA 3214 
924 TGYRLVDSTSCNREGVAIVP 943 



3215 CAA GGG ACA TTA AAG TGC AAG ATA GGA AAA ACA ACT GTA CAG GTC ATA GCT ATG GAT ACC 3274
944 QGTLKCKIGKTTVQVIAMDT 963



3275 AAA CTC GGA CCT ATG CCT TGC AGA CCA TAT GAA ATC ATA TCA AGT GAG GGG CCT GTA GAA 3334 

964 KLGPMPCR PYE I I SS EG PVE 983 

3335 AAG ACA GCG TGT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394 

984 KTACTFNYTKTLKNKY FE P R 1003 

3395 GAC AGC TAC TTT CAG CAA TAC ATG CTA AAA GGA GAG TAT CAA TAC TGG TTT GAC CTG GAG 3454 

1004 DSYFQQYMLKGEYQYWFDLE 1023



3455 GTG ACT GAC CAT CAC CGG GAT TAC TTC GCT GAG TCC ATA TTA GTG GTG GTA GTA GCC CTC 3514
1024 VTDHHRDYFAESILVVVVAL 1043
3515 TTG GGT GGC AGA TAT GTA CTT TGG TTA CTG GTT ACA TAC ATG GTC TTA TCA GAA CAG AAG 3574
1044 LGGRYVLWLLVTYMVLSEQK 1063 

3575 GCC TTA GGG ATT CAG TAT GGA TCA GGG GAA GTG GTC ATG ATG GGC AAC TTG CTA ACC CAT 3634 
1064 ALGIQYGSGEVVMMGNLLTH 1083

3635 AAC AAT ATT GAA GTG GTG ACA TAC TTC TTG CTG CTG TAC CTA CTG CTG AGG GAG GAG AGC 3694 
1084 NNIEVVTYFLLLYLLLREES 1103

3695 GTA AAG AAG TGG GTC TTA CTC TTA TAC CAC ATC TTA GTG GTA CAC CCA ATC AAA TCT GTA 3754 
1104 VKKWVLLLYHILVVHPIKSV 1123 

3755 ATT GTG ATC CTA CTG ATG ATT GGG GAT GTG GTA AAG GCC GAT TCA GGG GGC CAA GAG TAC 3814 
1124 IVILLMIGDVVKADSGGQEY 1143 

3815 TTG GGG AAA ATA GAC CTC TGT TTT ACA ACA GTA GTA CTA ATC GTC ATA GGT TTA ATC ATA 3874 
1144 LGKIDLCFTTVVLIVIGLII 1163

3875 GCC AGG CGT GAC CCA ACT ATA GTG CCA CTG GTA ACA ATA ATG GCA GCA CTG AGG GTC ACT 3934 
1164 AR RDPTIVPLVTIMAALRVT 1183 

3935 GAA CTG ACC CAC CAG CCT GGA GTT GAC ATC GCT GTG GCG GTC ATG ACT ATA ACC CTA CTG 3994 
1184 ELTHQPGVDIAVAVMTITLL 1203

3995 ATG GTT AGC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA TGG TTA CAG TGC ATT CTC AGC 4054 
1204 MVSYVTDYFRYKKWLQCILS 1223 

4055 CTG GTA TCT GCG GTG TTC TTG ATA AGA AGC CTA ATA TAC CTA GGT AGA ATC GAG ATG CCA 4114 
1224 LVSAVFLIRSLIYLGRIEMP 1243 

4115 GAG GTA ACT ATC CCA AAC TGG AGA CCA CTA ACT TTA ATA CTA TTA TAT TTG ATC TCA ACA 4174 
1244 EVTIPNWRPLTLILLYLIST 1263 

4175 ACA ATT GTA ACG AGG TGG AAG GTT GAC GTG GCT GGC CTA TTG TTG CAA TGT GTG CCT ATC 4234 
1264 TIVTRWKVDVAG LLLQCVPI 1283 

4235 TTA TTG CTG GTC ACA ACC TTG TGG GCC GAC TTC TTA ACC CTA ATA CTG ATC CTG CCT ACC 4294 
1284 LLLVTTLWADFLTLILILPT 1303 

4295 TAT GAA TTG GTT AAA TTA TAC TAT CTG AAA ACT GTT AGG ACT GAT ATA GAA AGA ACT TGG 4354 
1304 YELVKL YYLKTVRTDIERSW 1323 

4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT GAT GAG AGT GGA GAG 4414
1324 LGGIDYTRVDSIYDVDESGE 1343

4415 GGC GTA TAT CTT TTT CCA TCA AGG CAG AAA GCA CAG GGG AAT TTT TCT ATA CTC TTG CCC 4474
1344 GVYLFPSRQKAQGNFSILLP 1363

4475 CTT ATC AAA GCA ACA CTG ATA AGT TGC GTC AGC AGT AAA TGG CAG CTA ATA TAC ATG AGT 4534
1364 LIKATLISCVSSKWQLIYMS 1383 

4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AGG AAA GTT ATA GAA GAG ATC TCA GGA 4594 
1384 YLTLDFMYYMHRKVIEEISG 1403 

4595 GGT ACC AAC ATA ATA TCC AGG TTA GTG GCA GCA CTC ATA GAG CTG AAC TGG TCC ATG GAA 4654 
1404 GTN I ISRLVAAL I ELNWSME 1423 

4655 GAA GAG GAG AGC AAA GGC TTA AAG AAG TTT TAT CTA TTG TCT GGA AGG TTG AGA AAC CTA 4714 
1424 EEESKGLKKFYLLSGRLRNL 1443 

4715 ATA ATA AAA CAT AAG GTA AGG AAT GAG ACC GTG GCT TCT TGG TAC GGG GAG GAG GAA GTC 4774 
1444 I IKHKVRNETVASWYGEEEV 1463 

4775 TAC GGT ATG CCA AAG ATC ATG ACT ATA ATC AAG GCC AGT ACA CTG AGT AAG AGC AGG CAC 4834 
1464 YGMPKIMTI I KASTLSKSRH 1483 

4835 TGC ATA ATA TGC ACT GTA TGT GAG GGC CGA GAG TGG AAA GGT GGC ACC TGC CCA AAA TGT 4894 
1484 CIICTVCEGREWKGGTCPKC 1503 

4895 GGA CCC CAT GGG AAG CCG ATA ACG TGT GGG ATG TCC CTA GCA GAT TTT GAA GAA AGA CAC 4954 
1504 GRHGKPITCGMSLADFEERH 1523

4955 TAT AAA AGA ATC TTT ATA AGG GAA GGC AAC TTT GAG GGT ATG TGC AGC CGA TGC CAG GGA 5014 
1524 YKRIFIREGNFEGMCSRCQG 1543 

5015 AAG CAT AGG AGG TTT GAA ATG GAC CGG GAA CCT AAG AGT GCC AGA TAC TGT GCT GAG TGT 5074 
1544 KHRRFEMDREPKSARYCAEC 1563 

5075 AAT AGG CTG CAT CCT GCT GAG GAA GGT GAC TTT TGG GCA GAG TCC AGC ATG TTG GGC CTC 5134 
1564 NRLHPAEEGDFWAESSMLGL 1583 

5135 AAA ATC ACC TAC TTT GCG CTG ATG GAT GGA AAG GTG TAT GAT ATC ACA GAG TGG GCT GGA 5194 
1584 KITYFALMDGKVYDITEWAG 1603

5195 TCC CAG CCT GTC GGA ATC TCC CCA GAT ACC CAC AGA GTC CCT TGT CAC ATC TCA TTT GGT 5254 
1604 CQR VGISPDTHRVPCHISFG 1623 
5255 TCA CGG ATG CCT TTC AGC CAG GAA TAC AAT GGC TTT GTA CAA TAT ACC GCT AGG GGG CAA 5314

1624 SRMPFRQEYNGFVQYTARGQ 1643

5315 CTA TTT CTG AGA AAC TTG CCC GTA CTG GCA ACT AAA GTA AAA ATG CTC ATG GTA GGC AAC 5374 

1644 LFLRNLPVLATKVKMLMVGN 1663

5375 CTT GGA GAA GAA ATT GGT AAT CTG GAA CAT CTT GGG TGG ATC CTA AGG GGG CCT GCC GTG 5434 

1664 LGEEIGNLEHLGWILRGPAV 1683 

5435 TGT AAG AAG ATC ACA GAG CAC GAA AAA TGC CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA 5494

1684 CKKITEHEKCHINILDKLTA 1703 

5495 TTT TTC GGG ATC ATG CCA AGG GGG ACT ACA CCC AGA GCC CCG GTG AGG TTC CCT ACG AGC 5554 

1704 FFGIMPRGTTPRAPVRFPTS 1723 

5555 TTA CTA AAA GTG AGG AGG GGT CTG GAG ACT GCC TGG GCT TAC ACA CAC CAA GGC GGG ATA 5614 

1724 LLKVRRGLETAWAYTHQGG I 1743 

5615 ACT TCA GTC GAC CAT GTA ACC GCC GGA AAA GAT CTA CTG GTC TGT GAC AGC ATG GGA CGA 5674 

1744 SSVDHVTAGKDLLVCDSMGR 1763 

5675 ACT AGA GTG GTT TGC CAA AGC AAC AAC AGG TTG ACC GAT GAG ACA GAG TAT GGC GTC AAG 5734 

1764 TRVVCQSNNR LTDETEYGVK 1783 

5735 ACT GAC TCA GGG TGC CCA GAC GGT GCC AGA TGT TAT GTG TTA AAT CCA GAG GCC GTT AAC 5794 

1784 TDSGCPDGARCYVLNPEAVN 1803 

5795 ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC CTC CAA AAG ACA GGT GGA GAA TTC ACG TGT 5854 

1804 ISGSKGAVVHLQKTGGEFTC 1823 

5855 GTC ACC GCA TCA GGC ACA CCG GCT TTC TTC GAC CTA AAA AAC TTG AAA GGA TGG TCA GGC 5914 

1824 VTASGTPAFFDLKNLKGWSG 1843 

5915 TTG CCT ATA TTT GAA GCC TCC AGC GGG AGG GTG GTT GGC AGA GTC AAA GTA GGG AAG AAT 5974 

1844 LPI FEASSGRVVGRVKVGKN 1863 

5975 GAA GAG TCT AAA CCT ACA AAA ATA ATG AGT GGA ATC CAG ACC GTC TCA AAA AAC AGA GCA 6034 

1864 EESKPTKI MSGIQTVSKNRA 1883 

6035 GAC CTG ACC GAG ATG GTC AAG AAG ATA ACC AGC ATG AAC AGG GGA GAC TTC AAG CAG ATT 6094 

1884 DLTEMVKK ITSMNRGDFKQI 1903 

6095 ACT TTG GCA ACA GGG GCA GGC AAA ACC ACA GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA 6154 

1904 TLATGAGKTTELPKAV I EEI 1923 

6155 GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA CCA TTA AGG GCA GCG GCA GAG TCA GTC TAC 6214 

1924 GRHKRVLVLIPLRAAAESVY 1943 

6215 CAG TAT ATG AGA TTG AAA CAC CCA AGC ATC TCT TTT AAC CTA AGG ATA GGG GAC ATG AAA 6274 

1944 QYMRLKHPSISFNLRIGDMK 1963 

6275 GAG GGG GAC ATG GCA ACC GGG ATA ACC TAT GCA TCA TAC GGG TAC TTC TGC CAA ATG CCT 6334 

1964 EGDMATGI TYASYGYFCQMP 1983 

6335 CAA CCA AAG CTC AGA GCT GCT ATG GTA GAA TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT 6394 

1984 QPKLRAAMVEYSYI FLDE YH 2003 

6395 TGT GCC ACT CCT GAA CAA CTG GCA ATT ATC GGG AAG ATC CAC AGA TTT TCA GAG AGT ATA 6454 

2004 CATPEQLAIIGKIHRFSESI 2023 

6455 AGG GTT GTC GCC ATG ACT GCC ACG CCA GCA GGG TCG GTG ACC ACA ACA GGT CAA AAG CAC 6514 

2024 RVVAMTAT PAGSV TTTG QKH 2043 

6515 CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA ATG AAA GGG GAG GAT CTT GGT AGT CAG TTC 6574 

2044 PIEEFIAPEVMKGEDLGSQF 2063 

6575 CTT GAT ATA GCA GGG TTA AAA ATA CCA GTG GAT GAG ATG AAA GGC AAT ATG TTG GTT TTT 6634 

2064 LDIAGLKIPVDEMKGNMLVF 2083

6635 GTA CCA ACG AGA AAC ATG GCA GTA GAG GTA GCA AAG AAG CTA AAA GCT AAG GGC TAT AAC 6694 

2084 VPTRNMAVEVAKKLKAKGYN 2103 

6695 TCT GGA TAC TAT TAC AGT GGA GAG GAT CCA GCC AAT CTG AGA GTT GTG ACA TCA CAA TCC 6754 

2104 SGYYYSGEDPANLRVVTSQS 2123 

6755 CCC TAT GTA ATC GTG GCT ACA AAT GCT ATT GAA TCA GGA GTG ACA CTA CCA GAT TTG GAC 6814 

2124 PYVIVATNAIESGVTLPDLD 2143 

6815 ACG GTT ATA GAC ACG GGG TTG AAA TGT GAA AAG AGG GTG AGG GTA TCA TCA AAG ATA CCC 6874 

2144 TVIDTGLKCEKRVRVSSKIP 2163 

6875 TTC ATC GTA ACA GGC CTT AAG AGG ATG GCC GTG ACT GTG GGT GAG CAG GCG CAG CGT AGG 6934 

2164 FIVTGLKRMAVTVGEQAQRR 2183 

6935 GGC AGA GTA GGT AGA GTG AAA CCC GGG AGG TAT TAT AGG AGC CAG GAA ACA GCA ACA GGG 6994 

2184 GRVGRVKPGRYYRSQETATG 2203 
6995 TCA AAG GAC TAC CAC TAT GAC CTC TTG CAG GCA CAA AGA TAC GGG ATT GAG GAT GGA ATC 7054
2204 SKDYHYDLLQAQRYGIEDGI 2223

7055 AAC GTG ACG AAA TCC TTT AGG GAG ATG AAT TAC GAT TGG AGC CTA TAC GAG GAG GAC AGC 7114
2224 NVTKSF REMNYDWSLYEEDS 2243 

7115 CTA CTA ATA ACC CAG CTG GAA ATA CTA AAT AAT CTA CTC ATC TCA GAA GAC TTG CCA GCC 7174 
2244 LLITQLEI LNNLLISEDLPA 2263 

7175 GCT GTT AAG AAC ATA ATG GCC AGG ACT GAT CAC CCA GAG CCA ATC CAA CTT GCA TAC AAC 7234 
2264 AVKNIMARTDHPEPIQLAYN 2283 

7235 AGC TAT GAA GTC CAG GTC CCG GTC CTG TTC CCA AAA ATA AGG AAT GGA GAA GTC ACA GAC 7294 
2284 SYEVQVPVLFPKIRNGEVTD 2303 

7295 ACC TAC GAA AAT TAC TCG TTT CTA AAT GCC AGA AAG TTA GGG GAG GAT GTG CCC GTG TAT 7354 
2304 TYENYSFLNARKLGEDVPVY 2323

7355 ATC TAC GCT ACT GAA GAT GAG GAT CTG GCA GTT GAC CTC TTA GGG CTA GAC TGG CCT GAT 7414 
2324 I YATEDED L A V D L L G L DW P D 2343 

7415 CCT GGG AAC CAG CAG GTA GTG GAG ACT GGT AAA GCA CTG AAG CAA GTG ACC GGG TTG TCC 7474 
2344 PGNQQVVETGKALKQVTGLS 2363 

7475 TCG GCT GAA AAT GCC CTA CTA GTG GCT TTA TTT GGG TAT GTG GGT TAC CAG GCT CTC TCA 7534
2364 SAENALLVALFGYVGYQALS 2383 

7535 AAG AGG CAT GTC CCA ATG ATA ACA GAC ATA TAT ACC ATC GAG GAC CAG AGA CTA GAA GAC 7594
2384 KRHVPMITDIYTIEDQRLED 2403

7595 ACC ACC CAC CTC CAG TAT GCA CCC AAC GCC ATA AAA ACC GAT GGG ACA GAG ACT GAA CTG 7654
2404 TTHLQYAPNAIKTDGTETEL 2423

7655 AAA GAA CTG GCG TCG GGT GAC GTG GAA AAA ATC ATG GGA GCC ATT TCA GAT TAT GCA GCT 7714
2424 KELASGDVEKIMGAISDYAA 2443

7715 GGG GGA CTG GAG TTT GTT AAA TCC CAA GCA GAA AAG ATA AAA ACA GCT CCT TTG TTT AAA 7774
2444 GGLEFVKSQAEKIKTAPLFK 2463

7775 GAA AAC GCA GAA GCC GCA AAA GGG TAT GTC CAA AAA TTC ATT GAC TCA TTA ATT GAA AAT 7834
2464 ENAEAAKGYVQKFIDSLIEN 2483

7835 AAA GAA GAA ATA ATC AGA TAT GGT TTG TGG GGA ACA CAC ACA GCA CTA TAC AAA AGC ATA 7894
2484 KEEIIRYGLWGTHTALYKSI 2503

7895 GCT GCA AGA CTG GGG CAT GAA ACA GCG TTT GCC ACA CTA GTG TTA AAG TGG CTA GCT TTT 7954
2504 AARLGHETAFATLVLKWLAF 2523

7955 GGA GGG GAA TCA GTG TCA GAC CAC GTC AAG CAG GCG GCA GTT GAT TTA GTG GTC TAT TAT 8014
2524 GGESVSDHVKQAAVDLVVYY 2543

8015 GTG ATG AAT AAG CCT TCC TTC CCA GGT GAC TCC GAG ACA CAG CAA GAA GGG AGG CGA TTC 8074
2544 VMNKPSFPGDSETQQEGRRF 2563

8075 GTC GCA AGC CTG TTC ATC TCC GCA CTG GCA ACC TAC ACA TAC AAA ACT TGG AAT TAC CAC 8134
2564 VASLFISALATYTYKTWNYH 2583

8135 AAT CTC TCT AAA GTG GTG GAA CCA GCC CTG GCT TAC CTC CCC TAT GCT ACC AGC GCA TTA 8194
2584 NLSKVVEPALAYLPYATSAL 2603

8195 AAA ATG TTC ACC CCA ACG CGG CTG GAG AGC GTG GTG ATA CTG AGC ACC ACG ATA TAT AAA 8254
2604 KMFTPTRLESVVILSTTIYK 2623

8255 ACA TAC CTC TCT ATA AGG AAG GGG AAG ACT GAT GGA TTG CTG GGT ACG GGG ATA ACT GCA 8314
2624 TYLSIRKGKSDGLLGTGISA 2643

8315 GCC ATG GAA ATC CTG TCA CAA AAC CCA GTA TCG GTA GGT ATA TCT GTG ATG TTG GGG GTA 8374
2644 AMEILSQNPVSVGISVMLGV 2663

8375 GGG GCA ATC GCT GCG CAC AAC GCT ATT GAG TCC AGT GAA CAG AAA AGG ACC CTA CTT ATG 8434
2664 GAIAAHNAIESSEQKRTLLM 2683

8435 AAG GTG TTT GTA AAG AAC TTC TTG GAT CAG GCT GCA ACA GAT GAG CTG GTA AAA GAA AAC 8494
2684 KVFVKNFLDQAATDELVKEN 2703

8495 CCA GAA AAA ATT ATA ATG GCC TTA TTT GAA GCA GTC CAG ACA ATT GGT AAC CCC CTG AGA 8554
2704 PEKIIMALFEAVQTIGNPLR 2723

8555 CTA ATA TAC CAC CTG TAT GGG GTT TAC TAC AAA GGT TGG GAG GCC AAG GAA CTA TCT GAG 8614
2724 LIYHLYGVYYKGWEAKELSE 2743

8615 AGG ACA GCA GGC AGA AAC TTA TTC ACA TTG ATA ATG TTT GAA GCC TTC GAG TTA TTA GGG 8674
2744 RTAGRNLFTLIMFEAFELLG 2763

8675 ATG GAC TCA CAA GGG AAA ATA AGG AAC CTG TCC GGA AAT TAC ATT TTG GAT TTG ATA TAC 8734
2764 MDSQGKIRNLSGNYILDLIY 2783
2683 
8735 GGC CTA CAC AAG CAA ATC AAC AGA GGG CTG AAG AAA ...
2784 GLHKQINRCLKK ...

8795 CCC TTT AGT TGT GAC TGG ACC CCT AGT GAC GAG AGG ATC AGA TTG CCA ACA GAC AAC TAT 8854
2804 PFSCDWTPSDERIRLPTDNY 2823

8855 TTG AGG GTA GAA ACC AGG TGC CCA TGT GGC TAT GAG ATG AAA GCT TTC AAA AAT GTA GGT 8914
2824 LRVETRCPCGYEMKAFKNVG 2843

8915 GGC AAA CTT ACC AAA CTG GAG GAG AGC GGG CCT TTC CTA TGT AGA AAC AGA CCT GGT AGG 8974
2844 GKLTKVEESGPFLCRNRPGR 2863

8975 GGA CCA GTC AAC TAC AGA GTC ACC AAG TAT TAC GAT GAC AAC CTC AGA GAG ATA AAA CCA 9034
2864 GPVNYRVTKYYDDNLREIKP 2883

9035 GTA GCA AAG TTG GAA GGA CAG GTA GAG CAC TAC TAC AAA GGG GTC ACA GCA AAA ATT GAC 9094
2884 VAKLEGQVEHYYKGVTAKID 2903

9095 TAC AGT AAA GGA AAA ATG CTC TTG GCC ACT GAC AAG TGG GAG GTG GAA CAT GGT GTC ATA 9154
2904 YSKGKMLLATDKWEVEHGVI 2923

9155 ACC AGG TTA GCT AAG AGA TAT ACT GGG GTC GGG TTC AAT GGT GCA TAC TTA GGT GAC GAG 9214
2924 TRLAKRYTGVGFNGAYLGDE 2943

9215 CCC AAT CAC CGT GCT CTA GTG GAG AGG GAC TGT GCA ACT ATA ACC AAA AAC ACA GTA CAG 9274
2944 PNHRALVERDCATITKNTVQ 2963

9275 TTT CTA AAA ATG AAG AAG GGG TGT GCG TTC ACC TAT GAC CTG ACC ATC TCC AAT CTG ACC 9334
2964 FLKMKKGCAFTYDLTISNLT 2983

9335 AGG CTC ATC GAA CTA GTA CAC AGG AAC AAT CTT GAA GAG AAG GAA ATA CCC ACC GCT ACG 9394
2984 RLIELVHRNNLEEKEIPTAT 3003

9395 GTC ACC ACA TGG CTA GCT TAC ACC TTC GTG AAT GAA GAC GTA GGG ACT ATA AAA CCA GTA 9454
3004 VTTWLAYTFVNEDVGTIKPV 3023

9455 CTA GGA GAG AGA GTA ATC CCC GAC CCT GTA GTT GAT ATC AAT TTA CAA CCA GAG GTG CAA 9514
3024 LGERVIPDPVVDINLQPEVQ 3043

9515 GTG GAC ACG TCA GAG GTT GGG ATC ACA ATA ATT GGA AGG GAA ACC CTG ATG ACA ACG GGA 9574
3044 VDTSEVGITIIGRETLMTTG 3063

9575 GTG ACA CCT GTC TTG GAA AAA GTA GAG CCT GAC GCC AGC GAC AAC CAA AAC TCG GTG AAG 9634
3064 VTPVLEKVEPDASDNQNSVK 3083

9635 ATC GGG TTG GAT GAG GGT AAT TAC CCA GGG CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA 9694
3084 IGLDEGNYPGPGIQTHTLTE 3103

9695 GAA ATA CAC AAC AGG GAT GCG AGG CCC TTC ATC ATG ATC CTG GGC TCA AGG AAT TCC ATA 9754
3104 EIHNRDARPFIMILGSRNSI 3123

9755 TCA AAT AGG GCA AAG ACT GCT AGA AAT ATA AAT CTG TAC ACA GGA AAT GAC CCC AGG GAA 9814
3124 SNRAKTARNINLYTGNDPRE 3143

9815 ATA CGA GAC TTG ATG GCT GCA GGG CGC ATG TTA GTA GTA GCA CTG AGG GAT GTC GAC CCT 9874
3144 IRDLMAAGRMLVVALRDVDP 3163

9875 GAG CTG TCT GAA ATG GTC GAT TTC AAG GGG ACT TTT TTA GAT AGG GAG GCC CTG GAG GCT 9934
3164 ELSEMVDFKGTFLDREALEA 3183

9935 CTA AGT CTC GGG CAA CCT AAA CCG AAG CAG GTT ACC AAG GAA GCT GTT AGG AAT TTG ATA 9994
3184 LSLGQPKPKQVTKEAVRNLI 3203

9995 GAA CAG AAA AAA GAT GTG GAG ATC CCT AAC TGG TTT GCA TCA GAT GAC CCA GTA TTT CTG 10054
3204 EQKKDVEIPNWFASDDPVFL 3223

10055 GAA GTG GCC TTA AAA AAT GAT AAG TAC TAC TTA GTA GGA GAT GTT GGA GAG CTA AAA GAT 10114
3224 EVALKNDKYYLVGDVGELKD 3243

10115 CAA GCT AAA GCA CTT GGG GCC ACG GAT CAG ACA AGA ATT ATA AAG GAG GTA GGC TCA AGG 10174
3244 QAKALGATDQTRIIKEVGSR 3263

10175 ACG TAT GCC ATG AAG CTA TCT AGC TGG TTC CTC AAG GCA TCA AAC AAA CAG ATG AGT TTA 10234
3264 TYAMKLSSWFLKASNKQMSL 3283

10235 ACT CCA CTG TTT GAG GAA TTG TTG CTA CGG TGC CCA CCT GCA ACT AAG AGC AAT AAG GGG 10294
3284 TPLFEELLLRCPPATKSNKG 3303

10295 CAC ATG GCA TCA GCT TAC CAA TTG GCA CAG GGT AAC TGG GAG CCC CTC GGT TGC GGG GTG 10354
3304 HMASAYQLAQGNWEPLGCGV 3323

10355 CAC CTA GGT ACA ATA CCA GCC AGA AGG GTG AAG ATA CAC CCA TAT GAA GCT TAC CTG AAG 10414
3324 HLGTIPARRVKIHPYEAYLK 3343

10415 TTG AAA GAT TTC ATA GAA GAA GAA GAG AAG AAA CCT AGG GTT AAG GAT ACA GTA ATA AGA 10474
3344 LKDFIEEEEKKPRVKDTVIR 3363
10475 GAG CAC AAC AAA TGG ... A AAA ATA AGG TT ... CAA GGA AAC CTC AAC ACC AAG AAA 10534
3364 EHNKW ... KIRF ... QGNLNTKK 3383

10535 ATG CTC AAC CCG GGG AAA CTA TCT GAA CAG TTG GAC AGG GAG GGG CGC AAG AGG AAC ATC 10594
3384 MLNPGKLSEQLDREGRKRNI 3403

10595 TAC AAC CAC CAG ATT GGT ACT ATA ATG TCA AGT GCA GGC ATA AGG CTG GAG AAA TTG CCA 10654
3404 YNHQIGTIMSSAGIRLEKLP 3423

10655 ATA GTG AGG GCC CAA ACC GAC ACC AAA ACC TTT CAT GAG GCA ATA AGA GAT AAG ATA GAC 10714
3424 IVRAQTDTKTFHEAIRDKID 3443

10715 AAG AGT GAA AAC CGG CAA AAT CCA GAA TTG CAC AAC AAA TTG TTG GAG ATT TTC CAC ACG 10774
3444 KSENRQNPELHNKLLEIFHT 3463

10775 ATA GCC CAA CCC ACC CTG AAA CAC ACC TAC GGT GAG GTG ACG TGG GAG CAA CTT GAG GCG 10834
3464 IAQPTLKHTYGEVTWEQLEA 3483

10835 GGG ATA AAT AGA AAG GGG GCA GCA GGC TTC CTG GAG AAG AAG AAC ATC GGA GAA GTA TTG 10894
3484 GINRKGAAGFLEKKNIGEVL 3503

10895 GAT TCA GAA AAG CAC CTG GTA GAA CAA TTG GTC AGG GAT CTG AAG GCC GGG AGA AAG ATA 10954
3504 DSEKHLVEQLVRDLKAGRKI 3523

10955 AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT GAG AAG AGA GAT GTC AGT GAT GAC TGG CAG 11014
3524 KYYETAIPKNEKRDVSDDWQ 3543

11015 GCA GGG GAC CTG GTG GTT GAG AAG AGG CCA AGA GTT ATC CAA TAC CCT GAA GCC AAG ACA 11074
3544 AGDLVVEKRPRVIQYPEAKT 3563

11075 AGG CTA GCC ATC ACT AAG GTC ATG TAT AAC TGG GTG AAA CAG CAG CCC GTT GTG ATT CCA 11134
3564 RLAITKVMYNWVKQQPVVIP 3583

11135 GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC ATC TTT GAT AAA GTG AGA AAG GAA TGG GAC 11194
3584 GYEGKTPLFNIFDKVRKEWD 3603

11195 TCG TTC AAT GAG CCA GTG GCC GTA AGT TTT GAC ACC AAA GCC TGG GAC ACT CAA GTG ACT 11254
3604 SFNEPVAVSFDTKAWDTQVT 3623

11255 AGT AAG GAT CTG CAA CTT ATT GGA GAA ATC CAG AAA TAT TAC TAT AAG AAG GAG TGG CAC 11314
3624 SKDLQLIGEIQKYYYKKEWH 3643

11315 AAG TTC ATT GAC ACC ATC ACC GAC CAC ATG ACA GAA GTA CCA GTT ATA ACA GCA GAT GGT 11374
3644 KFIDTITDHMTEVPVITADG 3663

11375 GAA GTA TAT ATA AGA AAT GGG CAG AGA GGG AGC GGC CAG CCA GAC ACA AGT GCT GGC AAC 11434
3664 EVYIRNGQRGSGQPDTSAGN 3683

11435 AGC ATG TTA AAT GTC CTG ACA ATG ATG TAC GGC TTC TGC GAA AGC ACA GGG GTA CCG TAC 11494
3684 SMLNVLTMMYGFCESTGVPY 3703

11495 AAG AGT TTC AAC AGG GTG GCA AGG ATC CAC GTC TGT GGG GAT GAT GGC TTC TTA ATA ACT 11554
3704 KSFNRVARIHVCGDDGFLIT 3723

11555 GAA AAA GGG TTA GGG CTG AAA TTT GCT AAC AAA GGG ATG CAG ATT CTT CAT GAA GCA GGC 11614
3724 EKGLGLKFANKGMQILHEAG 3743

11615 AAA CCT CAG AAG ATA ACG GAA GGG GAA AAG ATG AAA GTT GCC TAT AGA TTT GAG GAT ATA 11674
3744 KPQKITEGEKMKVAYRFEDI 3763

11675 GAG TTC TGT TCT CAT ACC CCA GTC CCT GTT AGG TGG TCC GAC AAC ACC AGT AGT CAC ATG 11734
3764 EFCSHTPVPVRWSDNTSSHM 3783



11735 GCC GGG AGA GAC ACC GCT GTG ATA CTA TCA AAG ATG GCA ACA AGA TTG GAT TCA AGT GGA 11794 
3784 AGRDTAVI LSKMATRLDSSG 3803 



11795 GAG AGG GGT ACC ACA GCA TAT GAA AAA GCG GTA GCC TTC AGT TTC TTG CTG ATG TAT TCC 11854
3804 ERGTTAYEKAVAFSFLLMYS 3823

11855 TGG AAC CCG CTT GTT AGG AGG ATT TGC CTG TTG GTC CTT TCG CAA CAG CCA GAG ACA ... 11914
3824 WNPLVRRICLLVLSQQPET ... 3843

11915 CCA TCA AAA CAT GCC ACT TAT TAT TAC AAA GGT GAT CCA ATA GGG GCC TAT AAA GAT ... 11974
3844 PSKHATYYYKGDPIGAYKD ... 3863

11975 ATA GGT CGG AAT CTA AGT GAA CTG AAG AGA ACA GGC TTT GAG AAA TTG GCA AAT CTA ... 12034
3864 IGRNLSELKRTGFEKLANL ... 3883

12035 CTA AGC CTG TCC ACG TTG GGG ATC TGG ACT AAG CAC ACA AGC AAA AGA ATA ATT CAG ... 12094
3884 LSLSTLGIWTKHTSKRIIQ ... 3903

12095 TGT GTT GCC ATT GGG AAA GAA GAG GGC AAC TGG CTA GTT AAC GCC GAC AGG CTG ATA TCC 12154
3904 CVAIGKEEGNWLVNADRLIS 3923

12155 AGC AAA ACT GGC CAC TTA TAC ATA CCT GAT AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT 12214
3924 SKTGHLYIPDKGFTLQGKHY 3943

12215 GAG CAA CTG CAG CTA AGA ACA GAG ACA AAC CCG GTC ATG GGG GTT GGG ACT GAG AGA TAC 12274
3944 EQLQLRTETNPVMGVGTERY 3963

12275 AAG TTA GGT CCC ATA GTC AAT CTG CTG CTG AGA AGG TTG AAA ATT CTG CTC ATG ACC GCC 12334
3964 KLGPIVNLLLRRLKILLMTA 3983

12335 GTC GGC GTC AGC AGC TGA gacaaaacgtatacaccgtaaataaactaatccacgtacatagtgtacataaatat 12408 

3984 V G V S S ' 3989 

12409 agctgggaccgcccacctcaagaagacgacacgcccaacacgcacagccaaacagtagccaagaccatccaccccaagac 12488 

12489 aacactacacccaacgcacacagcaccccagccgcacgaggatacgcccgacgcccacagccggaccagggaagaccccc 12568 

12569 aacagccccc 12578
DNA sequence 12308 b.p. gtatacgagaat ... ctaacagccccc linear

1 gtatacgagaattagaaaaggcactcgcatacgtattgggcaattaaaaacaataaccaggcctagggaacaaatccctc 80 

81 tcagcgaaggccgaaaagaggctagccacgccctcagtaggaccagcataacgaggggggtagcaacagtggtgagttcg 160 

161 ttggatggctcaagccctgagtacagggtagtcgtcagtggttcgacgccttggaataaaggtcccgagatgccacgtgg 240 

241 acgagggcacgcccaaagcacaccccaacctgagcgggggtcgcccaggcaaaagcagttccaaccgaccgccacgaaca 320 

321 cagcccgacagggcgccgcagaggcccactgcattgccaccaaaaacccccgctgcacacggcac ATG GAG TTG 394 

1 MEL 3 

395 ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT 454 

4ITNELLYKTYKQKPVGVEEP 23 

455 GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT GGT GAA AGG GGA GCA GTC CAC CCT CAA TCG 514 

24VYDQAGDPLFGERGAVHPQS 43 

515 ACG CTA AAG CTC CCA CAC AAG AGA GGG GAA CGC GAT GTT CCA ACC AAC TTG GCA TCC TTA 574 

44T LKLPHKRGERDVPTNLASL 63 

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TAC CTG 634 

64PKRGDCRSGNSRGPVSGIYL 83 

635 AAG CCA GGG CCA CTA TTT TAC CAG GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CCG CTG 694 

84KPG PLFYQDYKG PVYHRAP L 103 

695 GAG CTC TTT GAG GAG GGA TCC ATG TGT GAA ACG ACT AAA CGG ATA GGG AGA GTA ACT GGA 754 

104 ELFEEGSMCETTKRIGRVTG 123 

755 ACT GAC GGA AAG CTG TAC CAC ATT TAT GTG TGT ATA GAT GGA TGT ATA ATA ATA AAA ACT 814 

124 SDGKLYHIYVCIDGCIIIKS 143

815 GCC ACG AGA AGT TAC CAA AGG GTG TTC AGG TGG GTC CAT AAT AGG CTT GAC TGC CCT CTA 874

144 ATRSYQRVFRWVHNRLDCPL 163

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAG AAA 934

164 WVTTCSDTKEEGATKKKTQK 183

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GAC AGC 994

184 PDRLERGKMKIVPKESEKDS 203

995 AAA ACT AAA CCT CCG GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AGG AAG 1054 

204 KTKPPDATIVVEGVKYQVRK 223 

1055 AAG GGA AAA ACC AAG AGT AAA AAC ACT CAG GAC GGC TTG TAC CAT AAC AAA AAC AAA CCT 1114 

224 KGKTKSKNTQDGLYHNKNKP 243 

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCC TGG GCA ATA ATA GCT ATA GTT 1174 

244 QESRKKLEKALLAWAI I AIV 263 

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TGG AAC CTA CAA GAT AAT GGG ACG 1234 

264 LFQVT MGENITQWNLQDNGT 283 

1235 GAA GGG ATA CAA CGG GCA ATG TTC CAA AGG GGT GTG AAT AGA AGT TTA CAT GGA ATC TGG 1294 

284 EGIQRAMFQRGVNRS LH G IW 303 

1295 CCA GAG AAA ATC TGT ACT GGT GTC CCT TCC CAT CTA GCC ACC GAT ATA GAA CTA AAA ACA 1354 

304 PEKICTGVPSHLATDI ELKT 323 

1355 ATT CAT GGT ATG ATG GAT GCA AGT GAG AAG ACC AAC TAC ACG TGT TGC AGA CTT CAA CGC 1414 

324 IHGMMDASEKTNYTCCRLQR 343 

1415 CAT GAG TGG AAC AAG CAT GGT TGG TGC AAC TGG TAC AAT ATT GAA CCC TGG ATT CTA GTC 1474 

344 HEWNKHGWCNWYNI EPWI L V 363 

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GTC ACT 1534 

364 MNRTQANLTEGQPPRECAVT 383

1535 TGT AGG TAT GAT AGG GCT AGT GAC TTA AAC GTG GTA ACA CAA GCT AGA GAT AGC CCC ACA 1594 

384 CRYDRASDLNVVTQARDSPT 403 

1595 CCC TTA ACA GGT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATG CGG GGC 1654 

404 PLTGCKKGKNFSFAGI LMRG 423 

1655 CCC TGC AAC TTT GAA ATA GCT GCA AGT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT ACT 1714 

424 PCNFEIAASDVLFKEHERIS 443 

1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA GGT GCC 1774 

444 MFQDTTLYLVDGLTNS L E G A 463 
1775 AGA CAA CCA ACC CCT AAA CTG ACA ACC TGG TTA CGC AAC CAG CTC GGG ATA CTA GGA AAA 1834

464 RQGTAKLTTWLGKQLGILGK 483

1835 AAG TTG CAA AAC AAG ACT AAG ACG TGG TTT GGA GCA TAC CCT GCT TCC CCT TAC TGT GAT 1894

484 KLENKSKTWFGAYAASPYCD 503

1895 GTC GAT CGC AAA ATT GGC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TCC TTA CCC 1954 

504 VDRKIGYIWYTKNCTPACLP 523 

1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT GAC ACC AAT GCA GAG GAC GGC AAG ATA 2014 

524 KNTKIVGPGKFDTNAEDGKI 543 

2015 TTA CAT GAG ATG GGG GGT CAC TTG TCC GAG GTA CTA CTA CTT TCT TTA GTG GTG CTG TCC 2074 

544 LHEMGGHLSEVLLLSLVVLS 563 

2075 GAC TTC GCA CCG GAA ACA GCT AGT GTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564 DFAPETASVMYLI LHFSI PQ 583 

2135 AGT CAC GTT GAT GTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTG GAG CTG ACA 2194 

584 SHVDVMDCDKTQLNLT VELT 603 

2195 ACA GCT GAA GTA ATA CCA GGG TCG GTC TGG AAT CTA GGC AAA TAT GTA TGT ATA AGA CCA 2254 

604 TAEVI PGSVWNLGKYVC I RP 623 

2255 AAT TGG TGG CCT TAT GAG ACA ACT GTA GTG TTG GCA TTT GAA GAG GTG AGC CAG GTG GTG 2314 

624 NWWPYETT VVLAFEEVSQVV 643 

2315 AAG TTA GTG TTG AGG GCA CTC AGA GAT TTA ACA CGC ATT TGG AAC GCT GCA ACA ACT ACT 2374 

644 KLVLRALRDLTRIWNAATT T 663 

2375 GCT TTT TTA GTA TGC CTT GTT AAG ATA GTC AGG GGC CAG ATG GTA CAG GGC ATT CTG TGG 2434 

664 AFLVCLVKIVRGQMVQGILW 683

2435 CTA CTA TTG ATA ACA GGG GTA CAA GGG CAC TTG GAT TGC AAA CCT GAA TTC TCG TAT GCC 2494 

684 LLLITGVQGHLDCKPEFSYA 703

2495 ATA GCA AAG GAC GAA AGA ATT GGT CAA CTG GGG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 

704 IAKDER IGQLGAEGLTTTWK 723 

2555 GAA TAC TCA CCT GGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC GAA GAT GGG 2614 

724 EYSPGMKLEDTMVI AWCEDG 743 






2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMYLQRCTRETRYLAILHT 763

2675 AGA GCC TTG CCG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG CGA AAG CAA GAG GAT 2734

764 RALPTSVVFKKLFDGRKQED 783

2735 GTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794

784 VVEMNDNFEFGLCPCDAKPI 803

2795 GTA AGA GGG AAG TTC AAT ACA ACG CTG CTG AAC GGA CCG GCC TTC CAG ATG GTA TGC CCC 2854

804 VRGKFNTTLLNGPAFQMVCP 823

2855 ATA GGA TGG ACA GGG ACT GTA AGC TGT ACG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914 

824 IGWTGTVSCTSFNMDTLATT 843 

2915 GTG GTA CGG ACA TAT AGA AGG TCT AAA CCA TTC CCT CAT AGG CAA GGC TGT ATC ACC CAA 2974 

844 VV RTYRRSKPFPHRQGC ITQ 863 

2975 AAG AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034 

864 KNLGEDLHNCILG GNWTCVP 883 

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TGT GGC TAT CAA 3094 

884 GDQLLYKGGSI ESCKWCGYQ 903 

3095 TTT AAA GAG AGT GAG GGA CTA CCA CAC TAC CCC ATT GGC AAG TGT AAA TTG GAG AAC GAG 3154 

904 FKESEGLPHYPI GKCKLENE 923 

3155 ACT GGT TAC AGG CTA GTA GAC AGT ACC TCT TGC AAT AGA GAA GGT GTG GCC ATA GTA CCA 3214 

924 TGYRLVDSTSCNREGVAI VP 943 

3215 CAA GGG ACA TTA AAG TGC AAG ATA GGA AAA ACA ACT GTA CAG GTC ATA GCT ATG GAT ACC 3274 

944 QGTLKCKIGKTTVQVIAMDT 963 

3275 AAA CTC GGA CCT ATG CCT TGC AGA CCA TAT GAA ATC ATA TCA AGT GAG GGG CCT GTA GAA 3334 

964 KLGPMPCRPYEI ISSEGPVE 983 

3335 AAG ACA GCG TGT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394 

984 K T A CT F NY TKT LK NK Y F E P R 1003 

3395 GAC AGC TAC TTT CAG CAA TAC ATG CTA AAA GGA GAG TAT CAA TAC TGG TTT GAC CTG GAG 3454 
1004 DSYFQQYMLKGEYQYWFDLE 1023 

3455 GTG ACT GAC CAT CAC CGG GAT TAC TTC GCT GAG TCC ATA TTA GTG GTG GTA GTA GCC CTC 3514 
1024 VTDHHRDYFAESILVVVVAL 1043 
3515 TTG GGT GGC AGA TAT GTA CTT TGG TTA CTG GTT ACA TAC ATG GTC TTA TCA GAA CAG AAG 3574
1044 LGGRYVLWLLVTYMVLSEQK 1063



3575 GCC TTA GGG ATT CAG TAT GGA TCA GGG GAA GTG GTG ATG ATG GGC AAC TTG CTA ACC CAT 3634
1064 ALGIQYGSGEVVMMGNLLTH 1083

3635 AAC AAT ATT GAA GTG GTG ACA TAC TTC TTG CTG CTG TAC CTA CTG CTG AGG GAG GAG AGC 3694
1084 NNIEVVTYFLLLYLLLREES 1103

3695 GTA AAG AAG TGG GTC TTA CTC TTA TAC CAC ATC TTA GTG GTA CAC CCA ATC AAA TCT GTA 3754
1104 VKKWVLLLYHILVVHPIKSV 1123

3755 ATT GTG ATC CTA CTG ATG ATT GGG GAT GTG GTA AAG GCC GAT TCA GGG GGC CAA GAG TAC 3814
1124 IVILLMIGDVVKADSGGQEY 1143

3815 TTG GGG AAA ATA GAC CTC TGT TTT ACA ACA GTA GTA CTA ATC GTC ATA GGT TTA ATC ATA 3874
1144 LGKIDLCFTTVVLIVIGLII 1163

3875 GCC AGG CGT GAC CCA ACT ATA GTG CCA CTG GTA ACA ATA ATG GCA GCA CTG AGG GTC ACT 3934
1164 ARRDPTIVPLVTIMAALRVT 1183

3935 GAA CTG ACC CAC CAG CCT GGA GTT GAC ATC GCT GTG GCG GTC ATG ACT ATA ACC CTA CTG 3994
1184 ELTHQPGVDIAVAVMTITLL 1203

3995 ATG GTT AGC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA TGG TTA CAG TGC ATT CTC AGC 4054
1204 MVSYVTDYFRYKKWLQCILS 1223

4055 CTG GTA TCT GCG GTG TTC TTG ATA AGA AGC CTA ATA TAC CTA GGT AGA ATC GAG ATG CCA 4114
1224 LVSAVFLIRSLIYLGRIEMP 1243

4115 GAG GTA ACT ATC CCA AAC TGG AGA CCA CTA ACT TTA ATA CTA TTA TAT TTG ATC TCA ACA 4174
1244 EVTIPNWRPLTLILLYLIST 1263

4175 ACA ATT GTA ACG AGG TGG AAG GTT GAC GTG GCT GGC CTA TTG TTG CAA TGT GTG CCT ATC 4234 

1264 TIVTRWKVDVAG LLLQCVPI 1283 

4235 TTA TTG CTG GTC ACA ACC TTG TGG GCC GAC TTC TTA ACC CTA ATA CTG ATC CTG CCT ACC 4294 

1284 LLLVTTLWADFLTLILI L PT 1303 

4295 TAT GAA TTG GTT AAA TTA TAC TAT CTG AAA ACT GTT AGG ACT GAT ATA GAA AGA AGT TGG 4354 

1304 YELVKLYYLKTVRTDIERSW 1323 

4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT GAT GAG AGT GGA GAG 4414 

1324 LGG IDYTRVDSIYDVDESGE 1343 

4415 GGC GTA TAT CTT TTT CCA TCA AGG CAG AAA GCA CAG GGG AAT TTT TCT ATA CTC TTG CCC 4474 

1344 GVY LFPSRQKAQGNFSI L LP 1363 

4475 CTT ATC AAA GCA ACA CTG ATA AGT TGC GTC AGC AGT AAA TGG CAG CTA ATA TAC ATG AGT 4534 

1364 LI K A T L I SCVSSKWQLIYMS 1383 

4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AGG AAA GTT ATA GAA GAG ATC TCA GGA 4594 

1384 YLTLDFMYYMHRKVIEEISG 1403

4595 GGT ACC AAC ATA ATA TCC AGG TTA GTG GCA GCA CTC ATA GAG CTG AAC TGG TCC ATG GAA 4654 

1404 GTNIISRLVAALIELNWSME 1423 



4655 GAA GAG GAG AGC AAA GGC TTA AAG AAG TTT TAT CTA TTG TCT GGA AGG TTG AGA AAC CTA 4714
1424 EEESKGLKKFYLLSGRLRNL 1443



4715 ATA ATA AAA CAT AAG GTA AGG AAT GAG ACC GTG GCT TCT TGG TAC GGG GAG GAG GAA GTC 4774 

1444 I I K HKVRNETVAS WYGEE EV 1463 

4775 TAC GGT ATG CCA AAG ATC ATG ACT ATA ATC AAG GCC AGT ACA CTG AGT AAG AGC AGG CAC 4834

1464 YGMPKIMTI IKASTLSKSRH 1483 

4835 TGC ATA ATA TGC ACT GTA TGT GAG GGC CGA GAG TGG AAA GGT GGC ACC TGC CCA AAA TGT 4894 

1484 CI I CTVC EGREW KG GTC P KC 1503 

4895 GGA CGC CAT GGG AAG CCG ATA ACG TGT GGG ATG TCC CTA GCA GAT TTT GAA GAA AGA CAC 4954 

1504 GRHGKPITCGMSLADFEE RH 1523 

4955 TAT AAA AGA ATC TTT ATA AGG GAA GGC AAC TTT GAG gggccc TTC AGG CAG GAA TAC AAT 5014 

1524 YKR IFIREGNFE FRQEYN 1541 

5015 GGC TTT GTA CAA TAT ACC GCT AGG GGG CAA CTA TTT CTG AGA AAC TTG CCC GTA CTG GCA 5074 

1542 GFVQYTA RGQLFLRNLPV LA 1561 



5075 ACT AAA GTA AAA ATG CTC ATG GTA GGC AAC CTT GGA GAA GAA ATT GGT AAT CTG GAA CAT 5134
1562 TKVKMLMVGNLGEEIGNLEH 1581



5135 CTT GGG TGG ATC CTA AGG GGG CCT GCC GTG TGT AAG AAG ATC ACA GAG CAC GAA AAA TGC 5194 

1582 LGWILRGPAVCKKITEHEKC 1601 

5195 CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA TTT TTC GGG ATC ATG CCA AGG GGG ACT ACA 5254

1602 HINILDKLTAFFGIMPRGTT 1621
5255 CCC AGA GCC CCG GTG AGG TTC CCT ACG AGC TTA CTA AAA GTG AGG AGG GGT CTG GAG ACT 5314
1622 PRAPVRFPTSLLKVRRGLET 1641

5315 GCC TGG GCT TAC ACA CAC CAA GGC GGG ATA AGT TCA GTC GAC CAT GTA ACC GCC GGA AAA 5374
1642 AWAYTHQGGISSVDHVTAGK 1661

5375 GAT CTA CTG GTC TGT GAC AGC ATG GGA CGA ACT AGA GTG GTT TGC CAA AGC AAC AAC AGG 5434
1662 DLLVCDSMGRTRVVCQSNNR 1681

5435 TTG ACC GAT GAG ACA GAG TAT GGC GTC AAG ACT GAC TCA GGG TGC CCA GAC GGT GCC AGA 5494
1682 LTDETEYGVKTDSGCPDGAR 1701

5495 TGT TAT GTG TTA AAT CCA GAG GCC GTT AAC ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC 5554
1702 CYVLNPEAVNISGSKGAVVH 1721

5555 CTC CAA AAG ACA GGT GGA GAA TTC ACG TGT GTC ACC GCA TCA GGC ACA CCG GCT TTC TTC 5614
1722 LQKTGGEFTCVTASGTPAFF 1741

5615 GAC CTA AAA AAC TTG AAA GGA TGG TCA GGC TTG CCT ATA TTT GAA GCC TCC AGC GGG AGG 5674
1742 DLKNLKGWSGLPIFEASSGR 1761

5675 GTG GTT GGC AGA GTC AAA GTA GGG AAG AAT GAA GAG TCT AAA CCT ACA AAA ATA ATG AGT 5734
1762 VVGRVKVGKNEESKPTKIMS 1781

5735 GGA ATC CAG ACC GTC TCA AAA AAC AGA GCA GAC CTG ACC GAG ATG GTC AAG AAG ATA ACC 5794
1782 GIQTVSKNRADLTEMVKKIT 1801

5795 AGC ATG AAC AGG GGA GAC TTC AAG CAG ATT ACT TTG GCA ACA GGG GCA GGC AAA ACC ACA 5854
1802 SMNRGDFKQITLATGAGKTT 1821

5855 GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA 5914
1822 ELPKAVIEEIGRHKRVLVLI 1841

5915 CCA TTA AGG GCA GCG GCA GAG TCA GTC TAC CAG TAT ATG AGA TTG AAA CAC CCA AGC ATC 5974
1842 PLRAAAESVYQYMRLKHPSI 1861

5975 TCT TTT AAC CTA AGG ATA GGG GAC ATG AAA GAG GGG GAC ATG GCA ACC GGG ATA ACC TAT 6034
1862 SFNLRIGDMKEGDMATGITY 1881

6035 GCA TCA TAC GGG TAC TTC TGC CAA ATG CCT CAA CCA AAG CTC AGA GCT GCT ATG GTA GAA 6094
1882 ASYGYFCQMPQPKLRAAMVE 1901

6095 TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT TGT GCC ACT CCT GAA CAA CTG GCA ATT ATC 6154
1902 YSYIFLDEYHCATPEQLAII 1921

6155 GGG AAG ATC CAC AGA TTT TCA GAG AGT ATA AGG GTT GTC GCC ATG ACT GCC ACG CCA GCA 6214
1922 GKIHRFSESIRVVAMTATPA 1941

6215 GGG TCG GTG ACC ACA ACA GGT CAA AAG CAC CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA 6274
1942 GSVTTTGQKHPIEEFIAPEV 1961

6275 ATG AAA GGG GAG GAT CTT GGT AGT CAG TTC CTT GAT ATA GCA GGG TTA AAA ATA CCA GTG 6334
1962 MKGEDLGSQFLDIAGLKIPV 1981

6335 GAT GAG ATG AAA GGC AAT ATG TTG GTT TTT GTA CCA ACG AGA AAC ATG GCA GTA GAG GTA 6394
1982 DEMKGNMLVFVPTRNMAVEV 2001

6395 GCA AAG AAG CTA AAA GCT AAG GGC TAT AAC TCT GGA TAC TAT TAC AGT GGA GAG GAT CCA 6454
2002 AKKLKAKGYNSGYYYSGEDP 2021

6455 GCC AAT CTG AGA GTT GTG ACA TCA CAA TCC CCC TAT GTA ATC GTG GCT ACA AAT GCT ATT 6514
2022 ANLRVVTSQSPYVIVATNAI 2041

6515 GAA TCA GGA GTG ACA CTA CCA GAT TTG GAC ACG GTT ATA GAC ACG GGG TTG AAA TGT GAA 6574
2042 ESGVTLPDLDTVIDTGLKCE 2061

6575 AAG AGG GTG AGG GTA TCA TCA AAG ATA CCC TTC ATC GTA ACA GGC CTT AAG AGG ATG GCC 6634
2062 KRVRVSSKIPFIVTGLKRMA 2081

6635 GTG ACT GTG GGT GAG CAG GCG CAG CGT AGG GGC AGA GTA GGT AGA GTG AAA CCC GGG AGG 6694
2082 VTVGEQAQRRGRVGRVKPGR 2101

6695 TAT TAT AGG AGC CAG GAA ACA GCA ACA GGG TCA AAG GAC TAC CAC TAT GAC CTC TTG CAG 6754
2102 YYRSQETATGSKDYHYDLLQ 2121

6755 GCA CAA AGA TAC GGG ATT GAG GAT GGA ATC AAC GTG ACG AAA TCC TTT AGG GAG ATG AAT 6814
2122 AQRYGIEDGINVTKSFREMN 2141

6815 TAC GAT TGG AGC CTA TAC GAG GAG GAC AGC CTA CTA ATA ACC CAG CTG GAA ATA CTA AAT 6874
2142 YDWSLYEEDSLLITQLEILN 2161

6875 AAT CTA CTC ATC TCA GAA GAC TTG CCA GCC GCT GTT AAG AAC ATA ATG GCC AGG ACT GAT 6934
2162 NLLISEDLPAAVKNIMARTD 2181

6935 CAC CCA GAG CCA ATC CAA CTT GCA TAC AAC AGC TAT GAA GTC CAG GTC CCG GTC CTG TTC 6994
2182 HPEPIQLAYNSYEVQVPVLF 2201

6995 CCA AAA ATA AGG AAT GGA GAA GTC ACA GAC ACC TAC GAA AAT TAC TCG TTT CTA AAT GCC 7054

2202 PKIRNGEVTDTYENYSFLNA 2221

7055 AGA AAG TTA GGG GAG GAT GTG CCC GTG TAT ATC TAC GCT ACT GAA GAT GAG GAT CTG GCA 7114 

2222 RKLGEDVPVYIYATEDEDLA 2241 

7115 GTT GAC CTC TTA GGG CTA GAC TGG CCT GAT CCT GGG AAC CAG CAG GTA GTG GAG ACT GGT 7174 

2242 VDLLGLDWPDPGNQQVVETG 2261 

7175 AAA GCA CTG AAG CAA GTG ACC GGG TTG TCC TCG GCT GAA AAT GCC CTA CTA GTG GCT TTA 7234 

2262 KALKQVTGLSSAENA L L V A L 2281 

7235 TTT GGG TAT GTG GGT TAC CAG GCT CTC TCA AAG AGG CAT GTC CCA ATG ATA ACA GAC ATA 7294 

2282 FGY VGYQALSKRHVPMITDI 2301 

7295 TAT ACC ATC GAG GAC CAG AGA CTA GAA GAC ACC ACC CAC CTC CAG TAT GCA CCC AAC GCC 7354 

2302 YTIEDQRLEDTTHLQYAPNA 2321 

7355 ATA AAA ACC GAT GGG ACA GAG ACT GAA CTG AAA GAA CTG GCG TCG GGT GAC GTG GAA AAA 7414 

2322 I KTDGTET E LK E LAS G DV E K 2341 

7415 ATC ATG GGA GCC ATT TCA GAT TAT GCA GCT GGG GGA CTG GAG TTT GTT AAA TCC CAA GCA 7474 

2342 IMGAISDYAAGGLEFVKSQA 2361 

7475 GAA AAG ATA AAA ACA GCT CCT TTG TTT AAA GAA AAC GCA GAA GCC GCA AAA GGG TAT GTC 7534 

2362 EKIKTAPLFKENAEAAKGYV 2381 

7535 CAA AAA TTC ATT GAC TCA TTA ATT GAA AAT AAA GAA GAA ATA ATC AGA TAT GGT TTG TGG 7594 

2382 QKFIDSLIENKEEIIRYGLW 2401 

7595 GGA ACA CAC ACA GCA CTA TAC AAA AGC ATA GCT GCA AGA CTG GGG CAT GAA ACA GCG TTT 7654 

2402 GTHTALYKSIAARLGHET AF 2421 

7655 GCC ACA CTA GTG TTA AAG TGG CTA GCT TTT GGA GGG GAA TCA GTG TCA GAC CAC GTC AAG 7714 

2422 ATLVLKWLAFGG ESVSDHVK 2441 

7715 CAG GCG GCA GTT GAT TTA GTG GTC TAT TAT GTG ATG AAT AAG CCT TCC TTC CCA GGT GAC 7774 

2442 QAAVDLVVYYVMNKPSFPGD 2461 

7775 TCC GAG ACA CAG CAA GAA GGG AGG CGA TTC GTC GCA AGC CTG TTC ATC TCC GCA CTG GCA 7834 

2462 SETQQEGRRFVASLF ISA LA 2481 

7835 ACC TAC ACA TAC AAA ACT TGG AAT TAC CAC AAT CTC TCT AAA GTG GTG GAA CCA GCC CTG 7894 

2482 TYTYKTWNYHNLSKVVEPAL 2501 



7895 GCT TAC CTC CCC TAT GCT ACC AGC GCA TTA AAA ATG TTC ACC CCA ACG CGG CTG GAG AGC 7954
2502 AYLPYATSALKMFTPTRLES 2521



7955 GTG GTG ATA CTG AGC ACC ACG ATA TAT AAA ACA TAC CTC TCT ATA AGG AAG GGG AAG AGT 8014 

2522 VVI LSTTIYKTYLSI RKGKS 2541 

8015 GAT GGA TTG CTG GGT ACG GGG ATA AGT GCA GCC ATG GAA ATC CTG TCA CAA AAC CCA GTA 8074 

2542 DGLLGTGISAAMEILSQNPV 2561 



8075 TCG GTA GGT ATA TCT GTG ATG TTG GGG GTA GGG GCA ATC GCT GCG CAC AAC GCT ATT GAG 8134
2562 SVGISVMLGVGAIAAHNAIE 2581



8135 TCC AGT GAA CAG AAA AGG ACC CTA CTT ATG AAG GTG TTT GTA AAG AAC TTC TTG GAT CAG 8194 

2582 SSEQKRTLLMKVFVK NFLDQ 2601 

8195 GCT GCA ACA GAT GAG CTG GTA AAA GAA AAC CCA GAA AAA ATT ATA ATG GCC TTA TTT GAA 8254 

2602 AATDELVKENP EK I I MA L F E 2621 

8255 GCA GTC CAG ACA ATT GGT AAC CCC CTG AGA CTA ATA TAC CAC CTG TAT GGG GTT TAC TAC 8314 

2622 AVQTIGNPLRLIYHLYGVYY 2641 

8315 AAA GGT TGG GAG GCC AAG GAA CTA TCT GAG AGG ACA GCA GGC AGA AAC TTA TTC ACA TTG 8374 

2642 KGWEAKELSERT AGRNLFTL 2661 

8375 ATA ATG TTT GAA GCC TTC GAG TTA TTA GGG ATG GAC TCA CAA GGG AAA ATA AGG AAC CTG 8434 

2662 IMFEAFELLGMDSQGKIRNL 2681

8435 TCC GGA AAT TAC ATT TTG GAT TTG ATA TAC GGC CTA CAC AAG CAA ATC AAC AGA GGG CTG 8494 

2682 SGNYILDLIYGLHKQINRGL 2701 

8495 AAG AAA ATG GTA CTG GGG TGG GCC CCT GCA CCC TTT AGT TGT GAC TGG ACC CCT AGT GAC 8554

2702 KKMVLGWAPAPFSCDWTPSD 2721 

8555 GAG AGG ATC AGA TTG CCA ACA GAC AAC TAT TTG AGG GTA GAA ACC AGG TGC CCA TGT GGC 8614

2722 ERIRLPTDNYLRVETRCPCG 2741 

8615 TAT GAG ATG AAA GCT TTC AAA AAT GTA GGT GGC AAA CTT ACC AAA GTG GAG GAG AGC GGG 8674 

2742 YEMKAFKNVGGKLTKVEESG 2761 

8675 CCT TTC CTA TGT AGA AAC AGA CCT GGT AGG GGA CCA GTC AAC TAC AGA GTC ACC AAG TAT 8734 

2762 PFLCRNRPGRGPVNYRVTKY 2781 
8735 TAC GAT GAC AAC CTC AGA GAG ATA AAA CCA GTA GCA AAG TTG GAA GGA CAG GTA GAG CAC 8794

2782 YDDNLREIKPVAKLEGQVEH 2801

8795 TAC TAC AAA GGG GTC ACA GCA AAA ATT GAC TAC AGT AAA GGA AAA ATG CTC TTG GCC ACT 8854

2802 YYKGVTAKIDYSKGKMLLAT 2821

8855 GAC AAG TGG GAG GTG GAA CAT GGT GTC ATA ACC AGG TTA GCT AAG AGA TAT ACT GGG GTC 8914

2822 DKWEVEHGVITRLAKRYTGV 2841

8915 GGG TTC AAT GGT GCA TAC TTA GGT GAC GAG CCC AAT CAC CGT GCT CTA GTG GAG AGG GAC 8974

2842 GFNGAYLGDEPNHRALVERD 2861

8975 TGT GCA ACT ATA ACC AAA AAC ACA GTA CAG TTT CTA AAA ATG AAG AAG GGG TGT GCG TTC 9034 

2862 CATITKNTVQFLKMKKGCAF 2881 

9035 ACC TAT GAC CTC ACC ATC TCC AAT CTG ACC AGG CTC ATC GAA CTA GTA CAC AGG AAC AAT 9094 

2882 TYDLTISNLTRLIELVHRNN 2901 

9095 CTT GAA GAG AAG GAA ATA CCC ACC GCT ACG GTC ACC ACA TGG CTA GCT TAC ACC TTC GTG 9154 

2902 LEEKEIPTATVTTWLAYTFV 2921 

9155 AAT GAA GAC GTA GGG ACT ATA AAA CCA GTA CTA GGA GAG AGA GTA ATC CCC GAC CCT GTA 9214 

2922 NEDVGTIKPVLGERVIPDPV 2941

9215 GTT GAT ATC AAT TTA CAA CCA GAG GTG CAA GTG GAC ACG TCA GAG GTT GGG ATC ACA ATA 9274 

2942 VDINLQPEVQVDTSEVGITI 2961 

9275 ATT GGA AGG GAA ACC CTG ATG ACA ACG GGA GTG ACA CCT GTC TTG GAA AAA GTA GAG CCT 9334 

2962 IGRETLMTTGVTPVLEKVEP 2981 

9335 GAC GCC AGC GAC AAC CAA AAC TCG GTG AAG ATC GGG TTG GAT GAG GGT AAT TAC CCA GGG 9394

2982 DASDNQNSVKIG LDEGNYPG 3001 

9395 CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA GAA ATA CAC AAC AGG GAT GCG AGG CCC TTC 9454 

3002 PGIQTHTLTEEIHNRDARPF 3021

9455 ATC ATG ATC CTG GGC TCA AGG AAT TCC ATA TCA AAT AGG GCA AAG ACT GCT AGA AAT ATA 9514

3022 IMILGSRNSISNRAKTARNI 3041

9515 AAT CTG TAC ACA GGA AAT GAC CCC AGG GAA ATA CGA GAC TTG ATG GCT GCA GGG CGC ATG 9574 

3042 NLYTGNDPREIRDLMAAGRM 3061

9575 TTA GTA GTA GCA CTG AGG GAT GTC GAC CCT GAG CTG TCT GAA ATG GTC GAT TTC AAG GGG 9634

3062 LVVALRDVDPELSEMVDFKG 3081

9635 ACT TTT TTA GAT AGG GAG GCC CTG GAG GCT CTA AGT CTC GGG CAA CCT AAA CCG AAG CAG 9694

3082 TFLDREALEALSLGQPKPKQ 3101

9695 GTT ACC AAG GAA GCT GTT AGG AAT TTG ATA GAA CAG AAA AAA GAT GTG GAG ATC CCT AAC 9754

3102 VTKEAVRNLIEQKKDVEIPN 3121

9755 TGG TTT GCA TCA GAT GAC CCA GTA TTT CTG GAA GTG GCC TTA AAA AAT GAT AAG TAC TAC 9814 

3122 WFASDDPVFLEVALKNDKYY 3141 

9815 TTA GTA GGA GAT GTT GGA GAG CTA AAA GAT CAA GCT AAA GCA CTT GGG GCC ACG GAT CAG 9874 

3142 LVGDVGELKDQAKALGATDQ 3161 

9875 ACA AGA ATT ATA AAG GAG GTA GGC TCA AGG ACG TAT GCC ATG AAG CTA TCT AGC TGG TTC 9934 

3162 TRIIKEVGSRTYAMKLSS WF 3181 



9935 CTC AAG GCA TCA AAC AAA CAG ATG AGT TTA ACT CCA CTG TTT GAG GAA TTG TTG CTA CCG 9994 

3182 LKASNKQMSLTPL FEELLLR 3201 

9995 TGC CCA CCT GCA ACT AAG AGC AAT AAG GGG CAC ATG GCA TCA GCT TAC CAA TTG GCA CAG 10054 

3202 CPPATKSNKGHMASAYQLAQ 3221 

10055 GGT AAC TGG GAG CCC CTC GGT TGC GGG GTG CAC CTA GGT ACA ATA CCA GCC AGA AGG GTG 10114 

3222 GNWEPLGCGVHL GTI PARRV 3241 

10115 AAG ATA CAC CCA TAT GAA GCT TAC CTG AAG TTG AAA GAT TTC ATA GAA GAA GAA GAG AAG 10174 

3242 KIH PYEAY LKLKDFI E EEEK 3261 

10175 AAA CCT AGG GTT AAG GAT ACA GTA ATA AGA GAG CAC AAC AAA TGG ATA CTT AAA AAA ATA 10234 

3262 KPRVKDTVI REHNKWI LKKI 3281 

10235 AGG TTT CAA GGA AAC CTC AAC ACC AAG AAA ATG CTC AAC CCG GGG AAA CTA TCT GAA CAG 10294 

3282 RFQGNLNTKKMLNPGKLSEQ 3301 

10295 TTG GAC AGG GAG GGG CGC AAG AGG AAC ATC TAC AAC CAC CAG ATT GGT ACT ATA ATG TCA 10354 

3302 LDREGRKRNIYNHQIGTIMS 3321 

10355 AGT GCA GGC ATA AGG CTG GAG AAA TTG CCA ATA GTG AGG GCC CAA ACC GAC ACC AAA ACC 10414 

3322 SAG IRLEKLPI VRAQTDTKT 3341 

10415 TTT CAT GAG GCA ATA AGA GAT AAG ATA GAC AAG AGT GAA AAC CGC CAA AAT CCA GAA TTG 10474 

3342 FHEAIRDKIDKSENRQNPEL 3361 
10475 CAC AAC AAA TTG TTG GAG ATT TTC CAC ACG ATA GCC CAA CCC ACC CTG AAA CAC ACC TAC 10534

3362 HNK LLEI FHTIAQPTLKHTV 3381 

10535 GGT GAG CTG ACG TGG GAG CAA CTT GAG GCG GGG ATA AAT AGA AAG GGG GCA GCA GGC TTC 10594 

3382 GEVTWEQLEAGINRKGAAGF 3401



10595 CTG GAG AAG AAG AAC ATC GGA GAA GTA TTG GAT TCA GAA AAG CAC CTG GTA GAA CAA TTG 10654
3402 LEKKNIGEVLDSEKHLVEQL 3421

10655 GTC AGG GAT CTG AAG GCC GGG AGA AAG ATA AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT 10714
3422 VRDLKAGRKIKYYETAIPKN 3441

10715 GAG AAG AGA GAT GTC AGT GAT GAC TGG CAG GCA GGG GAC CTG GTG GTT GAG AAG AGG CCA 10774
3442 EKRDVSDDWQAGDLVVEKRP 3461

10775 AGA GTT ATC CAA TAC CCT GAA GCC AAG ACA AGG CTA GCC ATC ACT AAG GTC ATG TAT AAC 10834
3462 RVIQYPEAKTRLAITKVMYN 3481



10835 TGG GTG AAA CAG CAG CCC GTT GTG ATT CCA GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC 10894 

3482 WVKQQPVVI PGYEGKTP L F N 3501 

10895 ATC TTT GAT AAA GTG AGA AAG GAA TGG GAC TCG TTC AAT GAG CCA GTG GCC GTA AGT TTT 10954 

3502 IFDKVRKEWDSFNEPVAVSF 3521 

10955 GAC ACC AAA GCC TGG GAC ACT CAA GTG ACT AGT AAG GAT CTG CAA CTT ATT GGA GAA ATC 11014 

3522 DTKAWDTQVTSKDLQLIG EI 3541 

11015 CAG AAA TAT TAC TAT AAG AAG GAG TGG CAC AAG TTC ATT GAC ACC ATC ACC GAC CAC ATG 11074 

3542 QKYYYKKEWHKFIDTITDHM 3561 

11075 ACA GAA GTA CCA GTT ATA ACA GCA GAT GGT GAA GTA TAT ATA AGA AAT GGG CAG AGA GGG 11134 

3562 TEV PVITADGEVYIRNGQRG 3581 



11135 AGC GGC CAG CCA GAC ACA AGT GCT GGC AAC AGC ATG TTA AAT GTC CTG ACA ATG ATG TAC 11194
3582 SGQPDTSAGNSMLNVLTMMY 3601



11195 GGC TTC TGC GAA AGC ACA GGG GTA CCG TAC AAG AGT TTC AAC AGG GTG GCA AGG ATC CAC 11254 

3602 GFC ESTGVPYKSFNRVAR IH 3621 

11255 GTC TGT GGG GAT GAT GGC TTC TTA ATA ACT GAA AAA GGG TTA GGG CTG AAA TTT GCT AAC 11314 

3622 VCGDDGFLITEKGLGLKFAN 3641 



11315 AAA GGG ATG CAG ATT CTT CAT GAA GCA GGC AAA CCT CAG AAG ATA ACG GAA GGG GAA AAG 11374 

3642 KGMQI LHEAGKPQKITEG EK 3661 

11375 ATG AAA GTT GCC TAT AGA TTT GAG GAT ATA GAG TTC TGT TCT CAT ACC CCA GTC CCT GTT 11434 

3662 MKVAYRFEDIEFCSHTPVPV 3681 

11435 AGG TGG TCC GAC AAC ACC AGT AGT CAC ATG GCC GGG AGA GAC ACC GCT GTG ATA CTA TCA 11494 

3682 RWS DNTSSHMAGRDTAVI LS 3701 



11495 AAG ATG GCA ACA AGA TTG GAT TCA AGT GGA GAG AGG GGT ACC ACA GCA TAT GAA AAA GCG 11554
3702 KMATRLDSSGERGTTAYEKA 3721

11555 GTA GCC TTC AGT TTC TTG CTG ATG TAT TCC TGG AAC CCG CTT GTT AGG AGG ATT TGC CTG 11614
3722 VAFSFLLMYSWNPLVRRICL 3741

11615 TTG GTC CTT TCG CAA CAG CCA GAG ACA GAC CCA TCA AAA CAT GCC ACT TAT TAT TAC AAA 11674
3742 LVLSQQPETDPSKHATYYYK 3761

11675 GGT GAT CCA ATA GGG GCC TAT AAA GAT GTA ATA GGT CGG AAT CTA AGT GAA CTG AAG AGA 11734
3762 GDPIGAYKDVIGRNLSELKR 3781

11735 ACA GGC TTT GAG AAA TTG GCA AAT CTA AAC CTA AGC CTG TCC ACG TTG GGG ATC TGG ACT 11794
3782 TGFEKLANLNLSLSTLGIWT 3801

11795 AAG CAC ACA AGC AAA AGA ATA ATT CAG GAC TGT GTT GCC ATT GGG AAA GAA GAG GGC AAC 11854
3802 KHTSKRIIQDCVAIGKEEGN 3821

11855 TGG CTA GTT AAC GCC GAC AGG CTG ATA TCC AGC AAA ACT GGC CAC TTA TAC ATA CCT GAT 11914
3822 WLVNADRLISSKTGHLYIPD 3841



11915 AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT GAG CAA CTG CAG CTA AGA ACA GAG ACA AAC 11974 
3842 KGFTLQGKHYEQLQLRTETN 3861 

11975 CCG GTC ATG GGG GTT GGG ACT GAG AGA TAC AAG TTA GGT CCC ATA GTC AAT CTG CTG CTG 12034 
3862 PVMGVGTERYKLGPIVNLLL 3881 

12035 AGA AGG TTG AAA ATT CTG CTC ATG ACG GCC GTC GGC GTC AGC AGC TGA gacaaaatgtatatat 12098 
3882 RRLKI LLMTAVGVSS* 3897 



12099 tgtaaacaaattaatccatgtacatagtgtatataaatacagttgggaccgtccacctcaagaagacgacacgcccaaca 12178
12179 cgcacagctaaacagtagtcaagaccacctacctcaagataacaccacattcaacgcacacagcactttagctgtatgag 12258 
12259 gacacgcccgacgcccacagtcggaccagggaagacctccaacagccccc 12308 



WO 99/55366 



PCT/US99/08850



37/67 



GTATaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgagtgtcgtgcagcctccag
gaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgaccgggtcctttcttggata 
aacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaaaggccttgtggtactgc 
ctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 13 
GTaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgagtgtcgtgcagcctccaggac
cccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgaccgggtcctttcttggataaac 
ccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaaaggccttgtggtactgcctg 
atagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 14 
GTATacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgagtgtcg
tgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgaccggg 
tcctttcttggataaacccgctcaatgcctggagacttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaaaggc
cttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 15 
GTATCAGAAGTGCGAATGCTGAacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaa

gcgtctagccatggcgttagtatgagtgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtg 

agtacaccggaattgccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactg 

ctagccgagtagtgttgggtcgcgaaaggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtg 

caccATG 



FIGURE 16 
GTATgccagccccctgatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctag
ccatggcgttagtatgagtgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacacc 
ggaattgccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccga 
gtagtgttgggtcgcgaaaggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG



FIGURE 17 
GTATTGCAGTTTgccagccccctgatgggggcgacactccaccatgaaccactcccctgtgaggaactactgtcttcacgc

agaaagcgtctagccatggcgttagtatgagtgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaac 

cggtgagtacaccggaattgccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaa 

gactgctagccgagtagtgttgggtcgcgaaaggccttgtggtactgcctgatagggtgctcgcgagtgccccgggaggtctcgtaga 

ccgtgcaccATG 



FIGURE 18 
GTATTGCAGTTTgccagccccctgatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgc

agaaagcgtctagccatggcgttagtatgagtgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaac 

cggtgagtacaccggaattgccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaa 

gactgctagccgagtagtgttgggtcgcgaaaggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtaga

ccgtgcaccATGGAGTTGATCACAAATGAACTTTTATACAAAACATACAAACAAAAAC

CCGTCGGGGTGGAGGAACCTGTTTATGATCAGGCAGGTGATCCCTTATTTGGT 

GAAAGGGGAGCAGTCCACCCTCAATCGACGCTAAAGCTCCCACACAAGAGAG 

GGGAACGCGATGTTCCAACCAACTTGGCATCCTTACCAAAAAGAGGTGACTGC 

AGGTCGGGTAATAGCAGAGGACCTGTGAGCGGGATCTACCTGAAGCCAGGGC 

CACTATTTTACCAGGACTATAAAGGTCCCGTCTATCACAGGGCCCCGCTGGAGC 

TCTTTGAGGAGGGATCCATGTGTGAAACGACTAAACGGATAGGGAGAGTAACT

GGAAGTGACGGAAAGCTGTACCACATTTATGTGTGTATAGATGGATGTATAATA 

ATAAAAAGTGCCACGAGAAGTTACCAAAGGGTGTTCAGGTGGGTCCATAATAG 

GCTTGACTGCCCTCTATGGGTCACAACTTGCTCAGACACGAAAGAAGAGGGAG 

CAACAAAAAAGAAAACACAGAAACCCGACAGACTAGAAAGGGGGAAAATGAA 

AATAGTGCCCAAAGAATCTGAAAAAGACAGCAAAACTAAACCTCCGGATGCTA 

CAATAGTGGTGGAAGGAGTCAAATACCAGGTGAGGAAGAAGGGAAAAACCAA 

GAGTAAAAACACTCAGGACGGCTTGTACCATAACAAAAACAAACCTCAGGAAT 

CACGCAAGAAACTGGAAAAAGCATTGTTGGCGTGGGCAATAATAGCTATAGTT 

TTGTTTCAAGTTACAATGGGAGAAAACATAACACAGTGGAACCTACAAGATAAT 

GGGACGGAAGGGATACAACGGGCAATGTTCCAAAGGGGTGTGAATAGAAGTT 

TACATGGAATCTGGCCAGAGAAAATCTGTACTGGTGTCCCTTCCCATCTAGCCA 

CCGATATAGAACTAAAAACAATTCATGGTATGATGGATGCAAGTGAGAAGACC 

AACTACACGTGTTGCAGACTTCAACGCCATGAGTGGAACAAGCATGGTTGGTG 

CAACTGGTACAATATTGAACCCTGGATTCTAGTCATGAATAGAACCCAAGCCAA 

TCTCACTGAGGGACAACCACCAAGGGAGTGCGCAGTCACTTGTAGGTATGATA 

GGGCTAGTGACTTAAACGTGGTAACACAAGCTAGAGATAGCCCCACACCCTTA

ACAGGTTGCAAGAAAGGAAAGAACTTCTCCTTTGCAGGCATATTGATGCGGGG 

CCCCTGCAACTTTGAAATAGCTGCAAGTGATGTATTATTCAAAGAACATGAACG

CATTAGTATGTTCCAGGATACTACTCTTTACCTTGTTGACGGGTTGACCAACTCC 

TTAGAAGGTGCCAGACAAGGAACCGCTAAACTGACAACCTGGTTAGGCAAGCA 

GCTCGGGATACTAGGAAAAAAGTTGGAAAACAAGAGTAAGACGTGGTTTGGAG 

CATACGCTGCTTCCCCTTACTGTGATGTCGATCGCAAAATTGGCTACATATGGT

ATACAAAAAATTGCACCCCTGCCTGCTTACCCAAGAACACAAAAATTGTCGGCC 

CTGGGAAATTTGACACCAATGCAGAGGACGGCAAGATATTACATGAGATGGGG 

GGTCACTTGTCGGAGGTACTACTACTTTCTTTAGTGGTGCTGTCCGACTTCGCA

CCGGAAACAGCTAGTGTAATGTACCTAATCCTACATTTTTCCATCCCACAAAGTC 

ACGTTGATGTAATGGATTGTGATAAGACCCAGTTGAACCTCACAGTGGAGCTG 

ACAACAGCTGAAGTAATACCAGGGTCGGTCTGGAATCTAGGCAAATATGTATG 

TATAAGACCAAATTGGTGGCCTTATGAGACAACTGTAGTGTTGGCATTTGAAGA 

GGTGAGCCAGGTGGTGAAGTTAGTGTTGAGGGCACTCAGAGATTTAACACGCA 

TTTGGAACGCTGCAACAACTACTGCTTTTTTAGTATGCCTTGTTAAGATAGTCAG

GGGCCAGATGGTACAGGGCATTCTGTGGCTACTATTGATAACAGGGGTACAAG 

GGCACTTGGATTGCAAACCTGAATTCTCGTATGCCATAGCAAAGGACGAAAGA 

ATTGGTCAACTGGGGGCTGAAGGCCTTACCACCACTTGGAAGGAATACTCACC 

TGGAATGAAGCTGGAAGACACAATGGTCATTGCTTGGTGCGAAGATGGGAAGT 

TAATGTACCTCCAAAGATGCACGAGAGAAACCAGGTATCTCGCAATCTTGCATA

CAAGAGCCTTGCCGACCAGTGTGGTATTCAAAAAACTCTTTGATGGGCGAAAG
CAAGAGGATGTAGTCGAAATGAACGACAACTTTGAATTTGGACTCTGCCCATGT

GATGCCAAACCCATAGTAAGAGGGAAGTTCAATACAACGCTGCTGAACGGACC 

GGCCTTCCAGATGGTATGCCCCATAGGATGGACAGGGACTGTAAGCTGTACGT 

CATTCAATATGGACACCTTAGCCACAACTGTGGTACGGACATATAGAAGGTCTA 

AACCATTCCCTCATAGGCAAGGCTGTATCACCCAAAAGAATCTGGGGGAGGAT 

CTCCATAACTGCATCCTTGGAGGAAATTGGACTTGTGTGCCTGGAGACCAACTA 

CTATACAAAGGGGGCTCTATTGAATCTTGCAAGTGGTGTGGCTATCAATTTAAA 

GAGAGTGAGGGACTACCACACTACCCCATTGGCAAGTGTAAATTGGAGAACGA 

GACTGGTTACAGGCTAGTAGACAGTACCTCTTGCAATAGAGAAGGTGTGGCCA

TAGTACCACAAGGGACATTAAAGTGCAAGATAGGAAAAACAACTGTACAGGTC 

ATAGCTATGGATACCAAACTCGGACCTATGCCTTGCAGACCATATGAAATCATA

TCAAGTGAGGGGCCTGTAGAAAAGACAGCGTGTACTTTCAACTACACTAAGAC

ATTAAAAAATAAGTATTTTGAGCCCAGAGACAGCTACTTTCAGCAATACATGCT

AAAAGGAGAGTATCAATACTGGTTTGACCTGGAGGTGACTGACCATCACCGGG 

ATTACTTCGCTGAGTCCATATTAGTGGTGGTAGTAGCCCTCTTGGGTGGCAGAT 

ATGTACTTTGGTTACTGGTTACATACATGGTCTTATCAGAACAGAAGGCCTTAG 

GGATTCAGTATGGATCAGGGGAAGTGGTGATGATGGGCAACTTGCTAACCCAT 

AACAATATTGAAGTGGTGACATACTTCTTGCTGCTGTACCTACTGCTGAGGGAG

GAGAGCGTAAAGAAGTGGGTCTTACTCTTATACCACATCTTAGTGGTACACCCA 

ATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGTGGTAAAGGCCGAT

TCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTTTTACAACAGTAGT

ACTAATCGTCATAGGTTTAATCATAGCCAGGCGTGACCCAACTATAGTGCCACT 

GGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACCCACCAGCCTGGAG 

TTGACATCGCTGTGGCGGTCATGACTATAACCCTACTGATGGTTAGCTATGTGA 

CAGATTATTTTAGATATAAAAAATGGTTACAGTGCATTCTCAGCCTGGTATCTGC 

GGTGTTCTTGATAAGAAGCCTAATATACCTAGGTAGAATCGAGATGCCAGAGG

TAACTATCCCAAACTGGAGACCACTAACTTTAATACTATTATATTTGATCTCAAC 

AACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCTATTGTTGCAATGTG 

TGCCTATCTTATTGCTGGTCACAACCTTGTGGGCCGACTTCTTAACCCTAATACT 

GATCCTGCCTACCTATGAATTGGTTAAATTATACTATCTGAAAACTGTTAGGACT 

GATATAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAAGAGTTGACTCCAT

CTACGACGTTGATGAGAGTGGAGAGGGCGTATATCTTTTTCCATCAAGGCAGA 

AAGCACAGGGGAATTTTTCTATACTCTTGCCCCTTATCAAAGCAACACTGATAA

GTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTTACTTAACTTTGGACT 

TTATGTACTACATGCACAGGAAAGTTATAGAAGAGATCTCAGGAGGTACCAACA 

TAATATCCAGGTTAGTGGCAGCACTCATAGAGCTGAACTGGTCCATGGAAGAA

GAGGAGAGCAAAGGCTTAAAGAAGTTTTATCTATTGTCTGGAAGGTTGAGAAA

CCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCTTCTTGGTACGGGG 

AGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATCAAGGCCAGTACA 

CTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGAGGGCCGAGAGTG 

GAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAGCCGATAACGTGT

GGGATGTCGCTAGCAGATTTTGAAGAAAGACACTATAAAAGAATCTTTATAAGG 

GAAGGCAACTTTGAGGGTATGTGCAGCCGATGCCAGGGAAAGCATAGGAGGT 

TTGAAATGGACCGGGAACCTAAGAGTGCCAGATACTGTGCTGAGTGTAATAGG 

CTGCATCCTGCTGAGGAAGGTGACTTTTGGGCAGAGTCGAGCATGTTGGGCCT

CAAAATCACCTACTTTGCGCTGATGGATGGAAAGGTGTATGATATCACAGAGTG 

GGCTGGATGCCAGCGTGTGGGAATCTCCCCAGATACCCACAGAGTCCCTTGTC

ACATCTCATTTGGTTCACGGATGCCTTTCAGGCAGGAATACAATGGCTTTGTAC

AATATACCGCTAGGGGGCAACTATTTCTGAGAAACTTGCCCGTACTGGCAACTA 

AAGTAAAAATGCTCATGGTAGGCAACCTTGGAGAAGAAATTGGTAATCTGGAA

CATCTTGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAAGATCACAGAGCA

CGAAAAATGCCACATTAATATACTGGATAAACTAACCGCATTTTTCGGGATCAT

GCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTACGAGCTTACTAA

AAGTGAGGAGGGGTCTGGAGACTGCCTGGGCTTACACACACCAAGGCGGGAT
AAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGTCTGTGACAGCA

TGGGACGAACTAGAGTGGTTTGCCAAAGCAACAACAGGTTGACCGATGAGACA 

GAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGCCAGATGTTATGT 

GTTAAATCCAGAGGCCGTTAACATATCAGGATCCAAAGGGGCAGTCGTTCACC 

TCCAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCAGGCAGACCGGCT 

TTCTTCGACCTAAAAAACTTGAAAGGATGGTCAGGCTTGCCTATATTTGAAGCC

TCCAGCGGGAGGGTGGTTGGCAGAGTCAAAGTAGGGAAGAATGAAGAGTCTA 

AACCTACAAAAATAATGAGTGGAATCCAGACCGTCTCAAAAAACAGAGCAGAC 

CTGACCGAGATGGTCAAGAAGATAACCAGCATGAACAGGGGAGACTTCAAGCA 

GATTACTTTGGCAACAGGGGCAGGCAAAACCACAGAACTCCCAAAAGCAGTTA 

TAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATACCATTAAGGGCA 

GCGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCCAAGCATCTCTTTT 

AACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAACCGGGATAACCT 

ATGCATCATACGGGTACTTCTGCCAAATGCCTCAACCAAAGCTCAGAGCTGCTA 

TGGTAGAATACTCATACATATTCTTAGATGAATACCATTGTGCCACTCCTGAACA 

ACTGGCAATTATCGGGAAGATCCACAGATTTTCAGAGAGTATAAGGGTTGTCG 

CCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGTCAAAAGCACCCA 

ATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGGATCTTGGTAGTCA 

GTTCCTTGATATAGCAGGGTTAAAAATACCAGTGGATGAGATGAAAGGCAATAT 

GTTGGTTTTTGTACCAACGAGAAACATGGCAGTAGAGGTAGCAAAGAAGCTAA 

AAGCTAAGGGCTATAACTCTGGATACTATTACAGTGGAGAGGATCCAGCCAAT 

CTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGCTACAAATGCTATT 

GAATCAGGAGTGACACTACCAGATTTGGACACGGTTATAGACACGGGGTTGAA 

ATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCATCGTAACAGGCC 

TTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCGTAGGGGCAGAGT 

AGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAAACAGCAACAGGG 

TCAAAGGACTACCACTATGACCTCTTGCAGGCACAAAGATACGGGATTGAGGA 

TGGAATCAACGTGACGAAATCCTTTAGGGAGATGAATTACGATTGGAGCCTATA 

CGAGGAGGACAGCCTACTAATAACCCAGCTGGAAATACTAAATAATCTACTCAT 

CTCAGAAGACTTGCCAGCCGCTGTTAAGAACATAATGGCCAGGACTGATCACC 

CAGAGCCAATCCAACTTGCATACAACAGCTATGAAGTCCAGGTCCCGGTCCTGT 

TCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACGAAAATTACTCGTTTC 

TAAATGCCAGAAAGTTAGGGGAGGATGTGCCCGTGTATATCTACGCTACTGAA 

GATGAGGATCTGGCAGTTGACCTCTTAGGGCTAGACTGGCCTGATCCTGGGAA 

CCAGCAGGTAGTGGAGACTGGTAAAGCACTGAAGCAAGTGACCGGGTTGTCCT 

CGGCTGAAAATGCCCTACTAGTGGCTTTATTTGGGTATGTGGGTTACCAGGCTC 

TCTCAAAGAGGCATGTCCCAATGATAACAGACATATATACCATCGAGGACCAGA 

GACTAGAAGACACCACCCACCTCCAGTATGCACCCAACGCCATAAAAACCGAT 

GGGACAGAGACTGAACTGAAAGAACTGGCGTCGGGTGACGTGGAAAAAATCA

TGGGAGCCATTTCAGATTATGCAGCTGGGGGACTGGAGTTTGTTAAATCCCAA

GCAGAAAAGATAAAAACAGCTCCTTTGTTTAAAGAAAACGCAGAAGCCGCAAA

AGGGTATGTCCAAAAATTCATTGACTCATTAATTGAAAATAAAGAAGAAATAAT

CAGATATGGTTTGTGGGGAACACACACAGCACTATACAAAAGCATAGCTGCAA 

GACTGGGGCATGAAACAGCGTTTGCCACACTAGTGTTAAAGTGGCTAGCTTTT 

GGAGGGGAATCAGTGTCAGACCACGTCAAGCAGGCGGCAGTTGATTTAGTGG 

TCTATTATGTGATGAATAAGCCTTCCTTCCCAGGTGACTCCGAGACACAGCAAG 

AAGGGAGGCGATTCGTCGCAAGCCTGTTCATCTCCGCACTGGCAACCTACACA 

TACAAAACTTGGAATTACCACAATCTCTCTAAAGTGGTGGAACCAGCCCTGGCT 

TACCTCCCCTATGCTACCAGCGCATTAAAAATGTTCACCCCAACGCGGCTGGAG 

AGCGTGGTGATACTGAGCACCACGATATATAAAACATACCTCTCTATAAGGAAG 

GGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGCAGCCATGGAAATCC 

TGTCACAAAACCCAGTATCGGTAGGTATATCTGTGATGTTGGGGGTAGGGGCA 

ATCGCTGCGCACAACGCTATTGAGTCCAGTGAACAGAAAAGGACCCTACTTAT 

GAAGGTGTTTGTAAAGAACTTCTTGGATCAGGCTGCAACAGATGAGCTGGTAA 
AAGAAAACCCAGAAAAAATTATAATGGCCTTATTTGAAGCAGTCCAGACAATTG 

GTAACCCCCTGAGACTAATATACCACCTGTATGGGGTTTACTACAAAGGTTGGG 

AGGCCAAGGAACTATCTGAGAGGACAGCAGGCAGAAACTTATTCACATTGATA 

ATGTTTGAAGCCTTCGAGTTATTAGGGATGGACTCACAAGGGAAAATAAGGAA 

CCTGTCCGGAAATTACATTTTGGATTTGATATACGGCCTACACAAGCAAATCAA 

CAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGCACCCTTTAGTTGTG 

ACTGGACCCCTAGTGACGAGAGGATCAGATTGCCAACAGACAACTATTTGAGG 

GTAGAAACCAGGTGCCCATGTGGCTATGAGATGAAAGCTTTCAAAAATGTAGG 

TGGCAAACTTACCAAAGTGGAGGAGAGCGGGCCTTTCCTATGTAGAAACAGAC 

CTGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATTACGATGACAACCTC 

AGAGAGATAAAACCAGTAGCAAAGTTGGAAGGACAGGTAGAGCACTACTACAA 

AGGGGTCACAGCAAAAATTGACTACAGTAAAGGAAAAATGCTCTTGGCCACTG 

ACAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAGCTAAGAGATATACT 

GGGGTCGGGTTCAATGGTGCATACTTAGGTGACGAGCCCAATCACCGTGCTCT 

AGTGGAGAGGGACTGTGCAACTATAACCAAAAACACAGTACAGTTTCTAAAAAT 

GAAGAAGGGGTGTGCGTTCACCTATGACCTGACCATCTCCAATCTGACCAGGC 

TCATCGAACTAGTACACAGGAACAATCTTGAAGAGAAGGAAATACCCACCGCT 

ACGGTCACCACATGGCTAGCTTACACCTTCGTGAATGAAGACGTAGGGACTAT 

AAAACCAGTACTAGGAGAGAGAGTAATCCCCGACCCTGTAGTTGATATCAATTT 

ACAACCAGAGGTGCAAGTGGACACGTCAGAGGTTGGGATCACAATAATTGGAA 

GGGAAACCCTGATGACAACGGGAGTGACACCTGTCTTGGAAAAAGTAGAGCCT 

GACGCCAGCGACAACCAAAACTCGGTGAAGATCGGGTTGGATGAGGGTAATTA 

CCCAGGGCCTGGAATACAGACACATACACTAACAGAAGAAATACACAACAGGG 

ATGCGAGGCCCTTCATCATGATCCTGGGCTCAAGGAATTCCATATCAAATAGGG 

CAAAGACTGCTAGAAATATAAATCTGTACACAGGAAATGACCCCAGGGAAATA 

CGAGACTTGATGGCTGCAGGGCGCATGTTAGTAGTAGCACTGAGGGATGTCGA 

CCCTGAGCTGTCTGAAATGGTCGATTTCAAGGGGACTTTTTTAGATAGGGAGG 

CCCTGGAGGCTCTAAGTCTCGGGCAACCTAAACCGAAGCAGGTTACCAAGGAA 

GCTGTTAGGAATTTGATAGAACAGAAAAAAGATGTGGAGATCCCTAACTGGTTT 

GCATCAGATGACCCAGTATTTCTGGAAGTGGCCTTAAAAAATGATAAGTACTAC 

TTAGTAGGAGATGTTGGAGAGCTAAAAGATCAAGCTAAAGCACTTGGGGCCAC 

GGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGGACGTATGCCATGAAGC 

TATCTAGCTGGTTCCTCAAGGCATCAAACAAACAGATGAGTTTAACTCCACTGT 

TTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAGAGCAATAAGGGGCAC 

ATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAGCCCCTCGGTTGCGG 

GGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAGATACACCCATATGAAG 

CTTACCTGAAGTTGAAAGATTTCATAGAAGAAGAAGAGAAGAAACCTAGGGTT 

AAGGATACAGTAATAAGAGAGCACAACAAATGGATACTTAAAAAAATAAGGTTT 

CAAGGAAACCTCAACACCAAGAAAATGCTCAACCCGGGGAAACTATCTGAACA 

GTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAACCACCAGATTGGTACT 

ATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCCAATAGTGAGGGCCCA 

AACCGACACCAAAACCTTTCATGAGGCAATAAGAGATAAGATAGACAAGAGTG 

AAAACCGGCAAAATCCAGAATTGCACAACAAATTGTTGGAGATTTTCCACACGA 

TAGCCCAACCCACCCTGAAACACACCTACGGTGAGGTGACGTGGGAGCAACTT 

GAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCTTCCTGGAGAAGAAGAACA 

TCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGAACAATTGGTCAGGGAT 

CTGAAGGCCGGGAGAAAGATAAAATATTATGAAACTGCAATACCAAAAAATGA 

GAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGACCTGGTGGTTGAGAAG 

AGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAAGGCTAGCCATCACTAA 

GGTCATGTATAACTGGGTGAAACAGCAGCCCGTTGTGATTCCAGGATATGAAG 

GAAAGACCCCCTTGTTCAACATCTTTGATAAAGTGAGAAAGGAATGGGACTCGT 

TCAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGCCTGGGACACTCAAGTG 

ACTAGTAAGGATCTGCAACTTATTGGAGAAATCCAGAAATATTACTATAAGAAG 

GAGTGGCACAAGTTCATTGACACCATCACCGACCACATGACAGAAGTACCAGT 
TATAACAGCAGATGGTGAAGTATATATAAGAAATGGGCAGAGAGGGAGCGGC 

CAGCCAGACACAAGTGCTGGCAACAGCATGTTAAATGTCCTGACAATGATGTA 

CGGCTTCTGCGAAAGCACAGGGGTACCGTACAAGAGTTTCAACAGGGTGGCAA 

GGATCCACGTCTGTGGGGATGATGGCTTCTTAATAACTGAAAAAGGGTTAGGG 

CTGAAATTTGCTAACAAAGGGATGCAGATTCTTCATGAAGCAGGCAAACCTCAG 

AAGATAACGGAAGGGGAAAAGATGAAAGTTGCCTATAGATTTGAGGATATAGA 

GTTCTGTTCTCATACCCCAGTCCCTGTTAGGTGGTCCGACAACACCAGTAGTCA 

CATGGCCGGGAGAGACACCGCTGTGATACTATCAAAGATGGCAACAAGATTGG 

ATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAAAAGCGGTAGCCTTCAGT 

TTCTTGCTGATGTATTCCTGGAACCCGCTTGTTAGGAGGATTTGCCTGTTGGTC 

CTTTCGCAACAGCCAGAGACAGACCCATCAAAACATGCCACTTATTATTACAAA 

GGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTCGGAATCTAAGTGAACT 

GAAGAGAACAGGCTTTGAGAAATTGGCAAATCTAAACCTAAGCCTGTCCACGTT 

GGGGATCTGGACTAAGCACACAAGCAAAAGAATAATTCAGGACTGTGTTGCCA 

TTGGGAAAGAAGAGGGCAACTGGCTAGTTAACGCCGACAGGCTGATATCCAGC 

AAAACTGGCCACTTATACATACCTGATAAAGGCTTTACATTACAAGGAAAGCAT 

TATGAGCAACTGCAGCTAAGAACAGAGACAAACCCGGTCATGGGGGTTGGGA 

CTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCTGCTGAGAAGGTTGAAA 

ATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAaggttggggtaaacactccggcctcttag 

cttccttctttaatggtggctccatcttagcccagt^ 

ggcctctctgcagatcatgtCCCCCGGCCGTCGGCGTCAGCTGAgacaaaatgtatatattgtaaataaattaatc 



atctacctcaagataacactacatttaatgcacacagcactttagctgtatgaggatacgcccgacgtctatagttggactagggaagacct 
ctaacagccccc 
50/67 



Gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcctaggtacatggcacgtgccagccccct 

gatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgag 

tgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgac 

cgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaa 

aggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATGAGCACGAATC 

CTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGTCGCCCACAGGACGTC 

AAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAG 

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAGCGGTCGCAA 

CCTCGAGGTAGACGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGA 

CCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGTTGCGGG 

TGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGGGCCCCAC 

AGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTACGT 

GCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTTGGA 

GGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGACGGCGTGA 

ACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTCTCTATCTTCCTTCTGGCC 

GCTCTCTTGCCTGACCGTGCCCGCTTCAGCCTACCAAGTGCGCAATTCCTCGGG 

GCTTTACCATGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGCGGC 

CGATGCCATCCTGCACACTCCGGGGTGTGTCCCTTGCGTTCGCGAGGGTAACG 

CCTCGAGGTGTTGGGTGGCGGTGACCCCCACGGTGGCCACCAGGGACGGCAA 

ACTCCCCACAACGCAGCTTCGACGTCATATCGATCTGCTTGTCGGGAGCGCCA 

CCCTCTGCTCGGCCCTCTACGTGGGGGACCTGTGCGGGTCTGTCTTTCTTGTTG 

GTCAACTGTTTACCTTCTCTCCCAGGCGCCACTGGACGACGCAAGACTGCAATT 

GTTCTATCTATCCCGGCCATATAACGGGTCATCGCATGGCATGGGATATGATGA 

TGAACTGGTCCCCTACGGCAGCGTTGGTGGTAGCTCAGCTGCTCCGGATCCCA 

CAAGCCATCATGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCAT 

AGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGC 

TATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCG 

CACCACGGCTGGGCTTGTTGGTCTCCTTACACCAGGCGCCAAGCAGAACATCC 

AACTGATCAACACCAACGGCAGTTGGCACATCAATAGCACGGCCTTGAACTGC 

AATGAAAGCCTTAACACCGGCTGGTTAGCAGGGCTCTTCTATCAGCACAAATTC 

AACTCTTCAGGCTGTCCTGAGAGGTTGGCCAGCTGCCGACGCCTTACCGATTTT 

GCCCAGGGCTGGGGTCCTATCAGTTATGCCAACGGAAGCGGCCTCGACGAAC 

GCCCCTACTGCTGGCACTACCCTCCAAGACCTTGTGGCATTGTGCCCGCAAAG 

AGCGTGTGTGGCCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGGAAC 

GACCGACAGGTCGGGCGCGCCTACCTACAGCTGGGGTGCAAATGATACGGAT 

GTCTTCGTCCTTAACAACACCAGGCCACCGCTGGGCAATTGGTTCGGTTGTACC 

TGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCCCCTTGTGTCAT 

CGGAGGGGTGGGCAACAACACCTTGCTCTGCCCCACTGATTGTTTCCGCAAGC 

ATCCGGAAGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATTACACCCAGG 

TGCATGGTCGACTACCCGTATAGGCTTTGCTCACTATCCTTGTACCATCAATTAC 

ACCATATTCAAAGTCAGGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAG 

CGGCCTGCAACTGGACGCGGGGCGAACGCTGTGATCTGGAAGACAGGGACAG 

GTCCGAGCTCAGCCCATTGCTGCTGTCCACCACACAGTGGCAGGTCCTTCCGT 

GTTCTTTCACGACCCTGCCAGCCTTGTCCACCGGCCTCATCCACCTCCACCAGA 

ACATTGTGGACGTGCAGTACTTGTACGGGGTAGGGTCAAGCATCGCGTCCTGG 

GCCATTAAGTGGGAGTACGTCGTTCTCCTGTTCCTCCTGCTTGCAGACGCGCGC 

GTCTGCTCCTGCTTGTGGATGATGTTACTCATATCCCAAGCGGAGGCGGCTTTG 

GAGAACCTCGTAATACTCAATGCAGCATCCCTGGCCGGGACGCACGGTCTTGT 

GTCCTTCCTCGTGTTCTTCTGCTTTGCGTGGTATCTGAAGGGTAGGTGGGTGCC 
CGGAGCGGTCTACGCCTTCTACGGGAAGTGGGTCTTACTCTTATACCACATCTT 

AGTGGTACACCCAATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGT 

GGTAAAGGCCGATTCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTT 

TTACAACAGTAGTACTAATCGTCATAGGTTTAATCATAGCTAGGCGTGACCCAA 

CTATAGTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACC 

CACCAGCCTGGAGTTGACATCGCTGTGGCGGTCATGACTATAACCCTACTGAT 

GGTTAGCTATGTGACAGATTATTTTAGATATAAAAAATGGTTACAGTGCATTCTC 

AGCCTGGTATCTGCGGTGTTCTTGATAAGAAGCCTAATATACCTAGGTAGAATC 

GAGATGCCAGAGGTAACTATCCCAAACTGGAGACCACTAACTTTAATACTATTA 

TATTTGATCTCAACAACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCTA 

TTGTTGCAATGTGTGCCTATCTTATTGCTGGTCACAACCTTGTGGGCCGACTTCT 

TAACCCTAATACTGATCCTGCCTACCTATGAATTGGTTAAATTATACTATCTGAA 

AACTGTTAGGACTGATACAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAA 

GAGTTGACTCCATCTACGACGTTGATGAGAGTGGAGAGGGCGTATATCTTTTTC 

CATCAAGGCAGAAAGCACAGGGGAATTTTTCTATACTCTTGCCCCTTATCAAAG 

CAACACTGATAAGTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTTACT 

TAACTTTGGACTTTATGTACTACATGCACAGGAAAGTTATAGAAGAGATCTCAG 

GAGGTACCAACATAATATCCAGGTTAGTGGCAGCACTCATAGAGCTGAACTGG 

TCCATGGAAGAAGAGGAGAGCAAAGGCTTAAAGAAGTTTTATCTATTGTCTGG 

AAGGTTGAGAAACCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCTT 

CTTGGTACGGGGAGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATC 

AAGGCCAGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGA 

GGGCCGAGAGTGGAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAG 

CCGATAACGTGTGGGATGTCGCTAGCAGATTTTGAAGAAAGACACTATAAAAG 

AATCTTTATAAGGGAAGGCAACTTTGAGGGTATGTGCAGCCGATGCCAGGGAA 

AGCATAGGAGGTTTGAAATGGACCGGGAACCTAAGAGTGCCAGATACTGTGCT 

GAGTGTAATAGGCTGCATCCTGCTGAGGAAGGTGACTTTTGGGCAGAGTCGAG 

CATGTTGGGCCTCAAAATCACCTACTTTGCGCTGATGGATGGAAAGGTGTATGA 

TATCACAGAGTGGGCTGGATGCCAGCGTGTGGGAATCTCCCCAGATACCCACA 

GAGTCCCTTGTCACATCTCATTTGGTTCACGGATGCCTTTCAGGCAGGAATACA 

ATGGCTTTGTACAATATACCGCTAGGGGGCAACTATTTCTGAGAAACTTGCCCG 

TACTGGCAACTAAAGTAAAAATGCTCATGGTAGGCAACCTTGGAGAAGAAATT 

GGTAATCTGGAACATCTTGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAA 

GATCACAGAGCACGAAAAATGCCACATTAATATACTGGATAAACTAACCGCATT 

TTTCGGGATCATGCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTA 

CGAGCTTACTAAAAGTGAGGAGGGGTCTGGAGACTGCCTGGGCTTACACACAC 

CAAGGCGGGATAAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGT 

CTGTGACAGCATGGGACGAACTAGAGTGGTTTGCCAAAGCAACAACAGGTTGA 

CCGATGAGACAGAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGC 

CAGATGTTATGTGTTAAATCCAGAGGCCGTTAACATATCAGGATCCAAAGGGG 

CAGTCGTTCACCTCCAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCA 

GGCACACCGGCTTTCTTCGACCTAAAAAACTTGAAAGGATGGTCAGGCTTGCCT 

ATATTTGAAGCCTCCAGCGGGAGGGTGGTTGGCAGAGTCAAAGTAGGGAAGA 

ATGAAGAGTCTAAACCTACAAAAATAATGAGTGGAATCCAGACCGTCTCAAAAA 

ACAGAGCAGACCTGACCGAGATGGTCAAGAAGATAACCAGCATGAACAGGGG 

AGACTTCAAGCAGATTACTTTGGCAACAGGGGCAGGCAAAACCACAGAACTCC 

CAAAAGCAGTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATA 

CCATTAAGGGCAGCGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCC 

AAGCATCTCTTTTAACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAA 

CCGGGATAACCTATGCATCATACGGGTACTTCTGCCAAATGCCTCAACCAAAGC 

TCAGAGCTGCTATGGTAGAATACTCATACATATTCTTAGATGAATACCATTGTGC 

CACTCCTGAACAACTGGCAATTATCGGGAAGATCCACAGATTTTCAGAGAGTAT 

AAGGGTTGTCGCCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGT 

CAAAAGCACCCAATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGG 
ATCTTGGTAGTCAGTTCCTTGATATAGCAGGGTTAAAAATACCAGTGGATGAGA 

TGAAAGGCAATATGTTGGTTTTTGTACCAACGAGAAACATGGCAGTAGAGGTA 

GCAAAGAAGCTAAAAGCTAAGGGCTATAACTCTGGATACTATTACAGTGGAGA 

GGATCCAGCCAATCTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGC 

TACAAATGCTATTGAATCAGGAGTGACACTACCAGATTTGGACACGGTTATAGA 

CACGGGGTTGAAATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCA 

TCGTAACAGGCCTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCG 

TAGGGGCAGAGTAGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAA 

ACAGCAACAGGGTCAAAGGACTACCACTATGACCTCTTGCAGGCACAAAGATA 

CGGGATTGAGGATGGAATCAACGTGACGAAATCCTTTAGGGAGATGAATTACG 

ATTGGAGCCTATACGAGGAGGACAGCCTACTAATAACCCAGCTGGAAATACTA 

AATAATCTACTCATCTCAGAAGACTTGCCAGCCGCTGTTAAGAACATAATGGCC 

AGGACTGATCACCCAGAGCCAATCCAACTTGCATACAACAGCTATGAAGTCCA 

GGTCCCGGTCCTATTCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACG 

AAAATTACTCGTTTCTAAATGCCAGAAAGTTAGGGGAGGATGTGCCCGTGTATA 

TCTACGCTACTGAAGATGAGGATCTGGCAGTTGACCTCTTAGGGCTAGACTGG 

CCTGATCCTGGGAACCAGCAGGTAGTGGAGACTGGTAAAGCACTGAAGCAAGT 

GACCGGGTTGTCCTCGGCTGAAAATGCCCTACTAGTGGCTTTATTTGGGTATGT 

GGGTTACCAGGCTCTCTCAAAGAGGCATGTCCCAATGATAACAGACATATATAC 

CATCGAGGACCAGAGACTAGAAGACACCACCCACCTCCAGTATGCACCCAACG 

CCATAAAAACCGATGGGACAGAGACTGAACTGAAAGAACTGGCGTCGGGTGA 

CGTGGAAAAAATCATGGGAGCCATTTCAGATTATGCAGCTGGGGGACTGGAGT 

TTGTTAAATCCCAAGCAGAAAAGATAAAAACAGCTCCTTTGTTTAAAGAAAACG 

CAGAAGCCGCAAAAGGGTATGTCCAAAAATTCATTGACTCATTAATTGAAAATA 

AAGAAGAAATAATCAGATATGGTTTGTGGGGAACACACACAGCACTATACAAA 

AGCATAGCTGCAAGACTGGGGCATGAAACAGCGTTTGCCACACTAGTGTTAAA 

GTGGCTAGCTTTTGGAGGGGAATCAGTGTCAGACCACGTCAAGCAGGCGGCA 

GTTGATTTAGTGGTCTATTATGTGATGAATAAGCCTTCCTTCCCAGGTGACTCC 

GAGACACAGCAAGAAGGGAGGCGATTCGTCGCAAGCCTGTTCATCTCCGCACT 

GGCAACCTACACATACAAAACTTGGAATTACCACAATCTCTCTAAAGTGGTGGA 

ACCAGCCCTGGCTTACCTCCCCTATGCTACCAGCGCATTAAAAATGTTCACCCC 

AACGCGGCTGGAGAGCGTGGTGATACTGAGCACCACGATATATAAAACATACC 

TCTCTATAAGGAAGGGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGC 

AGCCATGGAAATCCTGTCACAAAACCCAGTATCGGTAGGTATATCTGTGATGTT 

GGGGGTAGGGGCAATCGCTGCGGACAACGCTATTGAGTCCAGTGAACAGAAA 

AGGACCCTACTTATGAAGGTGTTTGTAAAGAACTTCTTGGATCAGGCTGCAACA 

GATGAGCTGGTAAAAGAAAACCCAGAAAAAATTATAATGGCCTTATTTGAAGCA 

GTCCAGACAATTGGTAACCCCCTGAGACTAATATACCACCTGTATGGGGTTTAC 

TACAAAGGTTGGGAGGCCAAGGAACTATCTGAGAGGACAGCAGGCAGAAACT 

TATTCACATTGATAATGTTTGAAGCCTTCGAGTTATTAGGGATGGACTCACAAG 

GGAAAATAAGGAACCTGTCCGGAAATTACATTTTGGATTTGATATACGGCCTAC 

ACAAGCAAATCAACAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGC 

ACCCTTTAGTTGTGACTGGACCCCTAGTGACGAGAGGATCAGATTGCCAACAG 

ACAACTATTTGAGGGTAGAAACCAGGTGCCCATGTGGCTATGAGATGAAAGCT 

TTCAAAAATGTAGGTGGCAAACTTACCAAAGTGGAGGAGAGCGGGCCTTTCCT 

ATGTAGAAACAGACCTGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATT 

ACGATGACAACCTCAGAGAGATAAAACCAGTAGCAAAGTTGGAAGGACAGGTA 

GAGCACTACTACAAAGGGGTCACAGCAAAAATTGACTACAGTAAAGGAAAAAT 

GCTCTTGGCCACTGACAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAG 

CTAAGAGATATACTGGGGTCGGGTTCAATGGTGCATACTTAGGTGACGAGCCC 

AATCACCGTGCTCTAGTGGAGAGGGACTGTGCAACTATAACCAAAAACACAGT 

ACAGTTTCTAAAAATGAAGAAGGGGTGTGCGTTCACCTATGACCTGACCATCTC 

CAATCTGACCAGGCTCATCGAACTAGTACACAGGAACAATCTTGAAGAGAAGG 

AAATACCCACCGCTACGGTCACCACATGGCTAGCTTACACCTTCGTGAATGAAG 
CACTGTTTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAGAGCAATAAG 

GGGGCACATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAGCCCCTCGG 

TTGCGGGGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAGATACACCCAT 

ATGAAGCTTACCTGAAGTTGAAAGATTTCATAGAAGAAGAAGAGAAGAAACCT 

AGGGTTAAGGATACAGTAATAAGAGAGCACAACAAATGGATACTTAAAAAAAT 

AAGGTTTCAAGGAAACCTCAACACCAAGAAAATGCTCAACCCTGGGAAACTATC 

TGAACAGTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAACCACCAGATT 

GGTACTATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCCAATAGTGAG 

GGCCCAAACCGACACCAAAACCTTTCATGAGGCAATAAGAGATAAGATAGACA 

AGAGTGAAAACCGGCAAAATCCAGAATTGCACAACAAATTGTTGGAGATTTTCC 

ACACGATAGCCCAACCCACCCTGAAACACACCTACGGTGAGGTGACGTGGGAG 

CAACTTGAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCTTCCTGGAGAAGA 

AGAACATCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGAACAATTGGTC 

AGGGATCTGAAGGCCGGGAGAAAGATAAAATATTATGAAACTGCAATACCAAA 

AAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGACCTGGTGGTT 

GAGAAGAGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAAGGCTAGCCAT 

CACTAAGGTCATGTATAACTGGGTGAAACAGCAGCCCGTTGTGATTCCAGGAT 

ATGAAGGAAAGACCCCCTTGTTCAACATCTTTGATAAAGTGAGAAAGGAATGG 

GACTCGTTCAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGCCTGGGACAC 

TCAAGTGACTAGTAAGGATCTGCAACTTATTGGAGAAATCCAGAAATATTACTA 

TAAGAAGGAGTGGCACAAGTTCATTGACACCATCACCGACCACATGACAGAAG 

TACCAGTTATAACAGCAGATGGTGAAGTATATATAAGAAATGGGCAGAGAGGG 

AGCGGCCAGCCAGACACAAGTGCTGGCAACAGCATGTTAAATGTCCTGACAAT 

GATGTACGCCTTCTGCGAAAGCACAGGGGTACCGTACAAGAGTTTCAACAGGG 

TGGCAAGGATCCACGTCTGTGGGGATGATGGCTTCTTAATAACTGAAAAAGGG 

TTAGGGCTGAAATTTGCTAACAAAGGGATGCAGATTCTTCATGAAGCAGGCAA 

ACCTCAGAAGATAACGGAAGGGGAAAAGATGAAAGTTGCGTATAGATTTGAGG 

ATATAGAGTTCTGTTCTCATACCCCAGTCCCTGTTAGGTGGTCCGACAACACCA 

GTAGTCACATGGCCGGGAGAGACACCGCTGTGATACTATCAAAGATGGCAACA 

AGATTGGATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAAAAGCGGTAG 

CCTTCAGTTTCTTGCTGATGTATTCCTGGAACCCGCTTGTTAGGAGGATTTGCCT 

GTTGGTCCTTTCGCAACAGCCAGAGACAGACCCATCAAAACATGCCACTTATTA 

TTACAAAGGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTCGGAATCTAA 

GTGAACTGAAGAGAACAGGCTTTGAGAAATTGGCAAATCTAAACCTAAGCCTG 

TCCACGTTGGGGATCTGGACTAAGCACACAAGCAAAAGAATAATTCAGGACTG 

TGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTAACGCCGACAGGCTGA 

TATCCAGCAAAACTGGCCACTTATACATACCTGATAAAGGCTTTACATTACAAG 

GAAAGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCCGGTCATGGGG 

GTTGGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCTGCTGAGAAG 

GTTGAAAATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAgacaaaatgtatatattgt 

aaataaattaatccatgtacatagtgtatataaatatagttgggaccgtccacctcaagaagacgacacgccca^ 

agtagtcaagattatctacctcaagataacactacatttaatg^ 

tagggaagacctctaacagccccc 
ACTGAAGATGAGGATCTGGCAGTTGACCTCTTAGGGCTAGACTGGCCTGATCC 

TGGGAACCAGCAGGTAGTGGAGACTGGTAAAGCACTGAAGCAAGTGACCGGG 

TTGTCCTCGGCTGAAAATGCCCTACTAGTGGCTTTATTTGGGTATGTGGGTTAC 

CAGGCTCTCTCAAAGAGGCATGTCCCAATGATAACAGACATATATACCATCGAG 

GACCAGAGACTAGAAGACACCACCCACCTCCAGTATGCACCCAACGCCATAAA 

AACCGATGGGACAGAGACTGAACTGAAAGAACTGGCGTCGGGTGACGTGGAA 

AAAATCATGGGAGCCATTTCAGATTATGCAGCTGGGGGACTGGAGTTTGTTAA 

ATCCCAAGCAGAAAAGATAAAAACAGCTCCTTTGTTTAAAGAAAACGCAGAAGC 

CGCAAAAGGGTATGTCCAAAAATTCATTGACTCATTAATTGAAAATAAAGAAGA 

AATAATCAGATATGGTTTGTGGGGAACACACACAGCACTATACAAAAGCATAGC 

TGCAAGACTGGGGCATGAAACAGCGTTTGCCACACTAGTGTTAAAGTGGCTAG 

CTTTTGGAGGGGAATCAGTGTCAGACCACGTCAAGCAGGCGGCAGTTGATTTA 

GTGGTCTATTATGTGATGAATAAGCCTTCCTTCCCAGGTGACTCCGAGACACAG 

CAAGAAGGGAGGCGATTCGTCGCAAGCCTGTTCATCTCCGCACTGGCAACCTA 

CACATACAAAACTTGGAATTACCACAATCTCTCTAAAGTGGTGGAACCAGCCCT 

GGCTTACCTCCCCTATGCTACCAGCGCATTAAAAATGTTCACCCCAACGCGGCT 

GGAGAGCGTGGTGATACTGAGCACCACGATATATAAAACATACCTCTCTATAAG 

GAAGGGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGCAGCCATGGAA 

ATCCTGTCACAAAACCCAGTATCGGTAGGTATATCTGTGATGTTGGGGGTAGG 

GGCAATCGCTGCGCACAACGCTATTGAGTCCAGTGAACAGAAAAGGACCCTAC 

TTATGAAGGTGTTTGTAAAGAACTTCTTGGATCAGGCTGCAACAGATGAGCTGG 

TAAAAGAAAACCCAGAAAAAATTATAATGGCCTTATTTGAAGCAGTCCAGACAA 

TTGGTAACCCCCTGAGACTAATATACCACCTGTATGGGGTTTACTACAAAGGTT 

GGGAGGCCAAGGAACTATCTGAGAGGACAGCAGGCAGAAACTTATTCACATTG 

ATAATGTTTGAAGCCTTCGAGTTATTAGGGATGGACTCACAAGGGAAAATAAG 

GAACCTGTCCGGAAATTACATTTTGGATTTGATATACGGCCTACACAAGCAAAT 

CAACAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGCACCCTTTAGTT 

GTGACTGGACCCCTAGTGACGAGAGGATCAGATTGCCAACAGACAACTATTTG 

AGGGTAGAAACCAGGTGCCCATGTGGCTATGAGATGAAAGCTTTCAAAAATGT 

AGGTGGCAAACTTACCAAAGTGGAGGAGAGCGGGCCTTTCCTATGTAGAAACA 

GACCTGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATTACGATGACAAC 

CTCAGAGAGATAAAACCAGTAGCAAAGTTGGAAGGACAGGTAGAGCACTACTA 

CAAAGGGGTCACAGCAAAAATTGACTACAGTAAAGGAAAAATGCTCTTGGCCA 

CTGACAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAGCTAAGAGATAT 

ACTGGGGTCGGGTTCAATGGTGCATACTTAGGTGACGAGCCCAATCACCGTGC 

TCTAGTCGAGAGGGACTGTGCAACTATAACCAAAAACACAGTACAGTTTCTAAA 

AATGAAGAAGGGGTGTGCGTTCACCTATGACCTGACCATCTCCAATCTGACCA 

GGCTCATCGAACTAGTACACAGGAACAATCTTGAAGAGAAGGAAATACCCACC 

GCTACGGTCACCACATGGCTAGCTTACACCTTCGTGAATGAAGACGTAGGGAC 

TATAAAACCAGTACTAGGAGAGAGAGTAATCCCCGACCCTGTAGTTGATATCAA 

TTTACAACCAGAGGTGCAAGTGGACACGTCAGAGGTTGGGATCACAATAATTG 

GAAGGGAAACCCTGATGACAACGGGAGTGACACCTGTCTTGGAAAAAGTAGA 

GCCTGACGCCAGCGACAACCAAAACTCGGTGAAGATCGGGTTGGATGAGGGT 

AATTACCCAGGGCCTGGAATACAGACACATACACTAACAGAAGAAATACACAA 

CAGGGATGCGAGGCCCTTCATCATGATCCTGGGCTCAAGGAATTCCATATCAA 

ATAGGGCAAAGACTGCTAGAAATATAAATCTGTACACAGGAAATGACCCCAGG 

GAAATACGAGACTTGATGGCTGCAGGGCGCATGTTAGTAGTAGCACTGAGGGA 

TGTCGACCCTGAGCTGTCTGAAATGGTCGATTTCAAGGGGACTTTTTTAGATAG 

GGAGGCCCTGGAGGCTCTAAGTCTCGGGCAACCTAAACCGAAGCAGGTTACCA 

AGGAAGCTGTTAGGAATTTGATAGAACAGAAAAAAGATGTGGAGATCCCTAAC 

TGGTTTGCATCAGATGACCCAGTATTTCTGGAAGTGGCCTTAAAAAATGATAAG 

TACTACTTAGTAGGAGATGTTGGAGAGGTAAAAGATCAAGCTAAAGCACTTGG 

GGCCACGGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGGACGTATGCCA 

TGAAGCTATCTAGCTGGTTCCTCAAGGCATCAAACAAACAGATGAGTTTAACTC 
ACTGATAAGTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTTACTTAAC 

TTTGGACTTTATGTACTACATGCACAGGAAAGTTATAGAAGAGATCTCAGGAGG 

TACCAACATAATATCCAGGTTAGTGGCAGCACTCATAGAGCTGAACTGGTCCAT 

GGAAGAAGAGGAGAGCAAAGGCTTAAAGAAGTTTTATCTATTGTCTGGAAGGT 

TGAGAAACCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCTTCTTGGT 

ACGGGGAGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATCAAGGCC 

AGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGAGGGCCG 

AGAGTGGAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAGCCGATA 

ACGTGTGGGATGTCGCTAGCAGATTTTGAAGAAAGACACTATAAAAGAATCTTT 

ATAAGGGAAGGCAACTTTGAGGGTATGTGCAGCCGATGCCAGGGAAAGCATA 

GGAGGTTTGAAATGGACCGGGAACCTAAGAGTGCCAGATACTGTGCTGAGTGT 

AATAGGCTGCATCCTGCTGAGGAAGGTGACTTTTGGGCAGAGTCGAGCATGTT 

GGGCCTCAAAATCACCTACTTTGCGCTGATGGATGGAAAGGTGTATGATATCAC 

AGAGTGGGCTGGATGCCAGCGTGTGGGAATCTCCCCAGATACCCACAGAGTCC 

CTTGTCACATCTCATTTGGTTCACGGATGCCTTTCAGGCAGGAATACAATGGCT 

TTGTACAATATACCGCTAGGGGGCAACTATTTCTGAGAAACTTGCCCGTACTGG 

CAACTAAAGTAAAAATGCTCATGGTAGGCAACCTTGGAGAAGAAATTGGTAATC 

TGGAACATCTTGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAAGATCACA 

GAGCACGAAAAATGCCACATTAATATACTGGATAAACTAACCGCATTTTTCGGG 

ATCATGCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTACGAGCTT 

ACTAAAAGTGAGGAGGGGTCTGGAGACTGGCTGGGCTTACACACACCAAGGC 

GGGATAAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGTCTGTGA 

CAGCATGGGACGAACTAGAGTGGTTTGCCAAAGCAACAACAGGTTGACCGATG 

AGACAGAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGCCAGATG 

TTATGTGTTAAATCCAGAGGCCGTTAACATATCAGGATCCAAAGGGGCAGTCGT 

TCACCTCCAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCAGGCACAC 

CGGCTTTCTTCGACCTAAAAAACTTGAAAGGATGGTCAGGCTTGCCTATATTTG 

AAGCCTCCAGCGGGAGGGTGGTTGGCAGAGTCAAAGTAGGGAAGAATGAAGA 

GTCTAAACCTACAAAAATAATGAGTGGAATCCAGACCGTCTCAAAAAACACAGC 

AGACCTGACCGAGATGGTCAAGAAGATAACCAGCATGAACAGGGGAGACTTCA 

AGCAGATTACTTTGGCAACAGGGGCAGGCAAAACCACAGAACTCCCAAAAGCA 

GTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATACCATTAAGG 

GCAGCGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCCAAGCATCTC 

TTTTAACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAACCGGGATA 

ACCTATGCATCATACGGGTACTTCTGCCAAATGCCTCAACCAAAGCTCAGAGCT 

GCTATGGTAGAATACTCATACATATTCTTAGATGAATACCATTGTGCCACTCCTG 

AACAACTGGCAATTATCGGGAAGATCCACAGATTTTCAGAGAGTATAAGGGTT 

GTCGCCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGTCAAAAGC 

ACCCAATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGGATCTTGGT 

AGTCAGTTCCTTGATATAGCAGGGTTAAAAATACCAGTGGATGAGATGAAAGG 

CAATATGTTGGTTTTTGTACCAACGAGAAACATGGCAGTAGAGGTAGCAAAGA 

AGCTAAAAGCTAAGGGCTATAACTCTGGATACTATTACAGTGGAGAGGATCCA 

GCCAATCTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGCTACAAAT 

GCTATTGAATCAGGAGTGACACTACCAGATTTGGACACGGTTATAGACACGGG 

GTTGAAATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCATCGTAA 

CAGGCCTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCGTAGGGG 

CAGAGTAGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAAACAGCA 

ACAGGGTCAAAGGACTACCACTATGACCTCTTGCAGGCACAAAGATACGGGAT 

TGAGGATGGAATCAACGTGACGAAATCCTTTAGGGAGATGAATTACGATTGGA 

GCCTATACGAGGAGGACAGCCTACTAATAACCCAGCTGGAAATACTAAATAATC 

TACTCATCTCAGAAGACTTGCCAGCCGCTGTTAAGAACATAATGGCCAGGACTG 

ATCACCCAGAGCCAATCCAACTTGCATACAACAGCTATGAAGTCCAGGTCCCG 

GTCCTGTTCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACGAAAATTAC 

TCGTTTCTAAATGCCAGAAAGTTAGGGGAGGATGTGCCCGTGTATATCTACGCT 
TCCCGCCAAGC, , JAGGTTATCACCCCTGCTGTCCAGACCAACTGGCAGAAACT 

CGAGGTCTTCTGGGCGAAGCACATGTGGAATTTCATCAGTGGGATACAATACTT 

GGCGGGCCTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTT 

TACAGCTGCCGTCACCAGCCCACTAACCACTGGCCAAACCCTCCTCTTCAACAT 

ATTGGGGGGGTGGGTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACCGCC 

TTTGTGGGCGCTGGCTTAGCTGGCGCCGCCATCGGCAGCGTTGGACTGGGGA 

AGGTCCTCGTGGACATTCTTGCAGGGTATGGCGCGGGCGTGGCGGGAGCTCT 

TGTAGCCTTCAAGATCATGAGCGGTGAGGTCCCCTCCACGGAGGACCTGGTCA 

ATCTGCTGCCCGCCATCCTCTCGCCTGGAGCCCTTGTAGTCGGTGTGGTCTGC 

GCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAATGGA 

TGAACCGGCTAATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCAC 

TACGTGCCGGAGAGCGATGCAGCCGCCCGCGTCACTGCCATACTCAGCAGCCT 

CACTGTAACCCAGCTCCTGATcgCTAGaccatggggtaccgagCGTTACTGGCCGAAGCC 

GCTTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCC 

GTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCA 

TTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCG 

TGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCG 

ACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCA 

AAAGCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACG 

TTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCA 

ACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTG 

GGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAACGTCTAG 

GCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAATAT 

GGAGTTGATCACAAATGAACTTTTATACAAAACATACAAACAAAAACCCGTCGG 

GGTGGAGGAACCTGTTTATGATCAGGCAGGTGATCCCTTATTTGGTGAAAGGG 

GAGCAGTCCACCCTCAATCGACGCTAAAGCTCCCACACAAGAGAGGGGAACGC 

GATGTTCCAACCAACTTGGCATCCTTACCAAAAAGAGGTGACTGCAGGTCGGG 

TAATAGCAGAGGACCTGTGAGCGGGATCTACCTGAAGCCAGGGCCACTATTTT 

ACCAGGACTATAAAGGTCCCGTCTATCACAGGGCCCCGCTGGAGCTCTTTGAG 

GAGGGATCCATGTGTGAAACGACTAAACGGATAGGGAGAGTAACTGGAAGTG 

ACGGAAAGCTGTACCACATTTATGTGTGTATAGATGGATGTATAATAATAAAAA 

GTGCCACGAGAAGTTACCAAAGGGTGTTCAGGTGGGTCCATAATAGGCTTGAC 

TGCCCTCTATGGGTCACAAGTTGCTCAGACACGAAAGAAGAGGGAGCAACAaag 

cttGCATTGTTGGCGTGGGCAATAATAGCTATAGTTTTGTTTCAAGTTACAATGGG 

AGAAAACATAACACAGTGGAACctgcagTGGTTTGACCTGGAGGTGACTGACCAT 

CACCGGGATTACTTCGCTGAGTCCATATTAGTGGTGGTAGTAGCCCTCTTGGGT 

GGCAGATATGTACTTTGGTTACTGGTTACATACATGGTCTTATCAGAACAGAAG 

GCCTTAGGGATTCAGTATGGATCAGGGGAAGTGGTGATGATGGGCAACTTGCT 

AACCCATAACAATATTGAAGTGGTGACATACTTCTTGCTGCTGTACCTACTGCT 

GAGGGAGGAGAGCGTAAAGAAGTGGGTCTTACTCTTATACCACATCTTAGTGG 

TACACCCAATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGTGGTAA 

AGGCCGATTCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTTTTACA 

ACAGTAGTACTAATCGTCATAGGTTTAATCATAGCTAGGCGTGACCCAACTATA 

GTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACCCACCA 

GCCTGGAGTTGACATCGCTGTGGCGGTCATGACTATAACCCTACTGATGGTTA 

GCTATGTGACAGATTATTTTAGATATAAAAAATGGTTACAGTGCATTCTCAGCCT 

GGTATCTGGGGTGTTCTTGATAAGAAGCCTAATATACCTAGGTAGAATCGAGAT 

GCCAGAGGTAACTATCCCAAACTGGAGACCACTAACTTTAATACTATTATATTTG 

ATCTCAACAACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCTATTGTT 

GCAATGTGTGCCTATCTTATTGCTGGTCACAACCTTGTGGGCCGACTTCTTAAC 

CCTAATACTGATCCTGCCTACCTATGAATTGGTTAAATTATACTATCTGAAAACT 

GTTAGGACTGATATAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAAGAGT 

TGACTCCATCTACGACGTTGATGAGAGTGGAGAGGGCGTATATCTTTTTCCATC 

AAGGCAGAAAGCACAGGGGAATTTTTCTATACTCTTGCCCCTTATCAAAGCAAC 



WO 99/55366 



PCT/US99/08850 



63/67 

CGGAGCGGTCTACGCCTTCTACGGGATGTGGCCTCTCCTCCTGCTCCTGCTGG 

CGTTGCCTCAGCGGGCATACGCACTGGACACGGAGGTGGCCGCGTCGTGTGG 

CGGCGTTGTTCTTGTCGGGTTAATGGCGCTGACTCTGTCGCCATATTACAAGCG 

CTACATCAGCTGGTGCATGTGGTGGCTTCAGTATTTTCTGACCAGAGTAGAAGC 

GCAACTGCACGTGTGGGTTCCCCCCCTCAACGTCCGGGGGGGGCGCGATGCC 

GTCATCTTACTCATGTGTGTTGTACACCCGACTCTGGTATTTGACATCACCAAAC 

TACTCCTGGCCATCTTCGGACCCCTTTGGATTCTTCAAGCCAGTTTGCTTAAAGT 

CCCCTACTTCGTGCGCGTTCAAGGCCTTCTCCGGATCTGCGCGCTAGCGCGGA 

AGATAGCCGGAGGTCATTACGTGCAAATGGCCATCATCAAGTTAGGGGCGCTT 

ACTGGCACCTATGTGTATAACCATCTCACCCCTCTTCGAGACTGGGCGCACAAC 

GGCCTGCGAGATCTGGCCGTGGCTGTGGAACCAGTCGTCTTCTCCCGAATGGA 

GACCAAGCTCATCACGTGGGGGGCAGATACCGCCGGGTGCGGTGACATCATC 

AACGGCTTGCCCGTCTCTGCCCGTAGGGGCCAGGAGATACTGCTTGGGCCAGC 

CGACGGAATGGTCTCCAAGGGGTGGAGGTTGCTGGCGCCCATCACGGCGTAC 

GCCCAGCAGACGAGAGGCCTCCTAGGGTGTATAATCACCAGCCTGACTGGCCG 

GGACAAAAACCAAGTGGAGGGTGAGGTCCAGATCGTGTCAACTGCTACCCAAA 

CCTTCCTGGCAACGTGCATCAATGGGGTATGCTGGACTGTCTACCACGGGGCC 

GGAACGAGGACCATCGCATCACCCAAGGGTCCTGTCATCCAGATGTATACCAA 

TGTGGACCAAGACCTTGTGGGCTGGCCCGCTCCTCAAGGTTCCCGCTCATTGA 

CACCCTGCACCTGCGGCTCCTCGGACCTTTACCTGGTCACGAGGCACGCCGAT 

GTCATTCCCGTGCGCCGGCGAGGTGATAGCAGGGGTAGCCTGCTTTCGCCCCG 

GCCCATTTCCTACTTGAAAGGCTCGTCGGGGGGTCCGCTGTTGTGCCCCGCGG 

GACACGCCGTGGGCCTATTCAGGGCCGCGGTGTGCACCCGTGGAGTGGCTAA 

GGCGGTGGACTTTATCCCTGTGGAGAACCTAGAGACAACCATGAGATCCCCGG 

TGTTCACGGACAACTCCTCTCCACCAGCAGTGCCCCAGAGCTTCCAGGTGGCC 

CACCTGCATGCTCCCACCGGCAGCGGTAAGAGCACCAAGGTCCCGGCTGCGTA 

CGCAGCCCAGGGCTACAAGGTGTTGGTGCTCAACCCCTCTGTTGCTGCAACGC 

TGGGCTTTGGTGCTTACATGTCCAAGGCCCATGGGGTTGATCCTAATATCAGGA 

CCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACCTACGGC 

AAGTTCCTTGCCGACGGCGGGTGCTCAGGAGGTGCTTATGACATAATAATTTGT 

GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATCGGCACTGTCCT 

TGACCAAGCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACTGCTACC 

CCTCCGGGCTCCGTCACTGTGTCCCATCCTAACATCGAGGAGGTTGCTCTGTCC 

ACCACCGGAGAGATCCCCTTTTACGGCAAGGCTATCCCCCTCGAGGTGATCAA 

GGGGGGAAGACATCTCATCTTCTGCCACTCAAAGAAGAAGTGCGACGAGCTCG 

CCGCGAAGCTGGTCGCATTGGGCATCAATGCCGTGGCCTACTACCGCGGTCTT 

GACGTGTCTGTCATCCCGACCAGCGGCGATGTTGTCGTCGTGTCGACCGATGC 

TCTCATGACTGGCTTTACCGGCGACTTCGACTCTGTGATAGACTGCAACACGTG 

TGTCACTCAGACAGTCGATTTCAGCCTTGACCCTACCTTTACCATTGAGACAAC 

CACGCTCCCCCAGGATGCTGTCTCCAGGACTCAACGCCGGGGCAGGACTGGC 

AGGGGGAAGCCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCG 

GCATGTTCGACTCGTCCGTCCTCTGTGAGTGCTATGACGCGGGCTGTGCTTGG 

TATGAGCTCACGCCCGCCGAGACTACAGTTAGGCTACGAGCGTACATGAACAC 

CCCGGGGCTTCCCGTGTGCCAGGACCATCTTGAATTTTGGGAGGGCGTCTTTA 

CGGGCCTCACTCATATAGATGCCCACTTTCTATCCCAGACAAAGCAGAGTGGG 

GAGAACTTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCTAGGGCTCA 

AGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATCCGCCTTAAAC 

CCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAAT 

GAAGTCACCCTGACGCACCCAATCACCAAATACATCATGACATGCATGTCGGCC 

GACCTGGAGGTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTG 

CTCTGGCCGCGTATTGCCTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGATT 

GTCTTGTCCGGGAAGCCGGCAATTATACCTGACAGGGAGGTTCTCTACCAGGA 

GTTCGATGAGATGGAAGAGTGCTCTCAGCACTTACCGTACATCGAGCAAGGGA 

TGATGCTCGCTGAGCAGTTCAAGCAGAAGGCCCTCGGCCTCCTGCAGACCGCG 
Gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcctaggtacatggcacgtgccagccccct 

gatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgag 

tgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgac 

cgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaa 

aggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATGAGCACGAATC 

CTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGTCGCCCACAGGACGTC 

AAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAG 

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAGCGGTCGCAA 

CCTCGAGGTAGACGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGA 

CCTGGGCTCAGCCCGGGTACCCTTGGGCCCTCTATGGCAATGAGGGTTGCGGG 

TGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGGGCCCCAC 

AGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTACGT 

GCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTTGGA 

GGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGACGGCGTGA 

ACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTCTCTATCTTCCTTCTGGCC 

GCTCTCTTGCCTGACCGTGCCCGCTTCAGCCTACCAAGTGCGCAATTCCTCGGG 

GCTTTACCATGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGCGGC 

CGATGCCATCCTGCACACTCCGGGGTGTGTCCCTTGCGTTCGCGAGGGTAACG 

CCTCGAGGTGTTGGGTGGCGGTGACCCCCACGGTGGCCACCAGGGACGGCAA 

ACTCCCCACAACGCAGCTTCGACGTCATATCGATCTGCTTGTCGGGAGCGCCA 

CCCTCTGCTCGGCCCTCTACGTGGGGGACCTGTGCGGGTCTGTCTTTCTTGTTG 

GTCAACTGTTTACCTTCTCTCCCAGGCGCCACTGGACGACGCAAGACTGCAATT 

GTTCTATCTATCCCGGCCATATAACGGGTCATCGCATGGCATGGGATATGATGA 

TGAACTGGTCCCCTACGGCAGCGTTGGTGGTAGCTCAGCTGCTCCGGATCCCA 

CAAGCCATCATGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCAT 

AGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGC 

TATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCG 

CACCACGGCTGGGCTTGTTGGTCTCCTTACACCAGGCGCCAAGCAGAACATCC 

AACTGATCAACACCAACGGCAGTTGGCACATCAATAGCACGGCCTTGAACTGC 

AATGAAAGCCTTAACACCGGCTGGTTAGCAGGGCTCTTCTATCAGCACAAATTC 

AACTCTTCAGGCTGTCCTGAGAGGTTGGCCAGCTGCCGACGCCTTACCGATTTT 

GCCCAGGGCTGGGGTCCTATCAGTTATGCCAACGGAAGCGGCCTCGACGAAC 

GCCCCTACTGCTGGCACTACCCTCCAAGACCTTGTGGCATTGTGCCCGCAAAG 

AGCGTGTGTGGCCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGGAAC 

GACCGACAGGTCGGGCGCGCCTACCTACAGCTGGGGTGCAAATGATACGGAT 

GTCTTCGTCCTTAACAACACCAGGCCACCGCTGGGCAATTGGTTCGGTTGTACC 

TGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCCCCTTGTGTCAT 

CGGAGGGGTGGGCAACAACACCTTGCTCTGCCCCACTGATTGTTTCCGCAAGC 

ATCCGGAAGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATTACACCCAGG 

TGCATGGTCGACTACCCGTATAGGCTTTGGCACTATCCTTGTACCATCAATTAC 

ACCATATTCAAAGTCAGGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAG 

CGGCCTGCAACTGGACGCGGGGCGAACGCTGTGATCTGGAAGACAGGGACAG 

GTCCGAGCTCAGCCCATTGCTGCTGTCCACCACACAGTGGCAGGTCCTTCCGT 

GTTCTTTCACGACCCTGCCAGCCTTGTCCACCGGCCTCATCCACCTCCACCAGA 

ACATTGTGGACGTGCAGTACTTGTACGGGGTAGGGTCAAGCATCGCGTCCTGG 

GCCATTAAGTGGGAGTACGTCGTTCTCCTGTTCCTCCTGCTTGCAGACGCGCGC 

GTCTGCTCCTGCTTGTGGATGATGTTACTCATATCCCAAGCGGAGGCGGCTTTG 

GAGAACCTCGTAATACTCAATGCAGCATCCCTGGCCGGGACGCACGGTCTTGT 

GTCCTTCCTCGTGTTCTTCTGCTTTGCGTGGTATCTGAAGGGTAGGTGGGTGCC 






60/67 

GTCATGGGGGTTGGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCT 

GCTGAGAAGGTTGAAAATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAg 

acaaaatgtatatattgtaaataaattaatccatgtacAATTCCGCCCCTCTCCCTCCCCCCCCCCTAACG 

TTACTGGCCGAAGCCGC TTGGA ATAAGGCCGGTGTGCGTTTGTCTATATGTTAT 

TTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTG 

TCTTCTTGACGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAG 

GTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAA 

CAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGG 

TGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGCGGCACA 

ACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCT 

CCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGT 

ATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAG 

GTTAAAAAACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAA 

AACACGATGATAAGCTTGCCACAACcatgaccgagtacaagcccacggtgcgcctcgccacccgcgacga 

cgtcccrcgggccgtacgcaccctcgccgccgcgttcgccga 

gagcgggtcaccgagctgcaagaactcttcctcacgcgcgtcgggctcgacatcggcaaggtgtgggtcgcggacgacggcgcc 

gcggtggcggtctggaccacgccggagagcgtcgaagcgggggcggtgttcgccgagatcggcccgcgcatggccgagttgag 

cggttcccggctggccgcgcagcaacagatggaaggcctcctggcgccgcaccggcccaaggagcccgcgtggttcctggccac 

cgtcggcgtctcgcccgaccaccagggcaagggtctgggcagcgccgtcgtgctccccggagtggaggcggccgagcgcgccg

gggtgcccgccttcctggagacctccgcgccccgcaacctccccttctacgagcggctcggcttcaccgtcaccgccgacgtcgagt

gcccgaaggaccgcgcgacctggtgcatgacccgcaagcccggtgccTGAcgcccgccccacgacccgcagcgcccgaccg 

aaaggagcgcacgaccccatgaaATGCATCGATCGTACGAATTAACGCCGACAGGCTGATAT 

CCAGCAAAACTGGCCACTTATACATACCTGATAAAGGCTTTACATTACAAGGAA 

AGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCCGGTCATGGGGGTT 

GGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCTGCTGAGAAGGTT 

GAAAATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAgacaaaatgtatatattgtaaata 

aattaatccatgtacatagtgtatataaatatagttgggaccgtccacctcaagaagacgacacgcccaacacgcacagctaaacagtag

tcaagattatctacctcaagataacactacatttaatgcacacagcactttagctgtatgaggatacgcccgacgtctatagttggactagg 
gaagacctctaacagccccc 
aaatacccaccgctacggtcaccacatggctagcttacaccttcgtgaatgaag

acgtagggactataaaaccagtactaggagagagagtaatccccgaccctgta 

gttgatatcaatttacaaccagaggtgcaagtggacacgtcagaggttgggat

cacaataattggaagggaaaccctgatgacaacgggagtgacacctgtcttgg 

aaaaagtagagcctgacgccagcgacaaccaaaactcggtgaagatcgggttg 

gatgagggtaattacccagggcctggaatacagacacatacactaacagaaga 

aatacacaacagggatgcgaggcccttcatcatgatcctgggctcaaggaatt 

ccatatcaaatagggcaaagactgctagaaatataaatctgtacacaggaaatg 

accccagggaaatacgagacttgatggctgcagggcgcatgttagtagtagca 

ctgagggatgtcgaccctgagctgtctgaaatggtcgatttcaaggggacttt

tttagatagggaggccctggaggctctaagtctcgggcaacctaaaccgaagc 

AGGTTACCAAGGAAGCTGTTAGGAATTTGATAGAACAGAAAAAAGATGTGGAG 

ATCCCTAACTGGTTTGCATCAGATGACCCAGTATTTCTGGAAGTGGCCTTAAAA 

AATGATAAGTACTACTTAGTAGGAGATGTTGGAGAGGTAAAAGATCAAGCTAA 

AGCACTTGGGGCCACGGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGG 

ACGTATGCCATGAAGCTATCTAGCTGGTTCCTCAAGGCATCAAACAAACAGATG 

AGTTTAACTCCACTGTTTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAG 

AGCAATAAGGGGCACATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGA 

GCCCCTCGGTTGCGGGGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAG 

ATACACCCATATGAAGCTTACCTGAAGTTGAAAGATTTCATAGAAGAAGAAGAG

AAGAAACCTAGGGTTAAGGATACAGTAATAAGAGAGCACAACAAATGGATACT 

TAAAAAAATAAGGTTTCAAGGAAACCTCAACACCAAGAAAATGCTCAACCCTGG 

GAAACTATCTGAACAGTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAAC 

CACCAGATTGGTACTATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCC 

AATAGTGAGGGCCCAAACCGACACCAAAACCTTTCATGAGGCAATAAGAGATA 

AGATAGACAAGAGTGAAAACCGGCAAAATCCAGAATTGCACAACAAATTGTTG 

GAGATTTTCCACACGATAGCCCAACCCACCCTGAAACACACCTACGGTGAGGT

GACGTGGGAGCAACTTGAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCTTC

CTGGAGAAGAAGAACATCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGA

ACAATTGGTCAGGGATCTGAAGGCCGGGAGAAAGATAAAATATTATGAAACTG

CAATACCAAAAAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGA

CCTGGTGGTTGAGAAGAGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAA

GGCTAGCCATCACTAAGGTCATGTATAACTGGGTGAAACAGCAGCCCGTTGTG

ATTCCAGGATATGAAGGAAAGACCCCCTTGTTCAACATCTTTGATAAAGTGAGA

AAGGAATGGGACTCGTTCAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGC 

CTGGGACACTCAAGTGACTAGTAAGGATCTGCAACTTATTGGAGAAATCCAGA 

AATATTACTATAAGAAGGAGTGGCACAAGTTCATTGACACCATCACCGACCACA 

TGACAGAAGTACCAGTTATAACAGCAGATGGTGAAGTATATATAAGAAATGGG 

CAGAGAGGGAGCGGCCAGCCAGACACAAGTGCTGGCAACAGCATGTTAAATG 

TCCTGACAATGATGTACGCCTTCTGCGAAAGCACAGGGGTACCGTACAAGAGT 

TTCAACAGGGTGGCAAGGATCCACGTCTGTGGGGATGATGGCTTCTTAATAAC

TGAAAAAGGGTTAGGGCTGAAATTTGCTAACAAAGGGATGCAGATTCTTCATG 

AAGCAGGCAAACCTCAGAAGATAACGGAAGGGGAAAAGATGAAAGTTGCCTAT 

AGATTTGAGGATATAGAGTTCTGTTCTCATACCCCAGTCCCTGTTAGGTGGTCC 

GACAACACCAGTAGTCACATGGCCGGGAGAGACACCGCTGTGATACTATCAAA 

GATGGCAACAAGATTGGATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAA 

AAGCGGTAGCCTTCAGTTTCTTGCTGATGTATTCCTGGAACCCGCTTGTTAGGA 

GGATTTGCCTGTTGGTCCTTTCGCAACAGCCAGAGACAGACCCATCAAAACATG 

CCACTTATTATTACAAAGGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTC 

GGAATCTAAGTGAACTGAAGAGAACAGGCTTTGAGAAATTGGCAAATCTAAAC 

CTAAGCCTGTCCACGTTGGGGATCTGGACTAAGCACACAAGCAAAAGAATAAT 

TCAGGACTGTGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTAACGCCG 

ACAGGCTGATATCCAGCAAAACTGGCCACTTATACATACCTGATAAAGGCTTTA 

CATTACAAGGAAAGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCCG 
CAAAAGCACCCAATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGG

ATCTTGGTAGTCAGTTCCTTGATATAGCAGGGTTAAAAATACCAGTGGATGAGA 

TGAAAGGCAATATGTTGGTTTTTGTACCAACGAGAAACATGGCAGTAGAGGTA 

GCAAAGAAGCTAAAAGCTAAGGGCTATAACTCTGGATACTATTACAGTGGAGA 

GGATCCAGCCAATCTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGC 

TACAAATGCTATTGAATCAGGAGTGACACTACCAGATTTGGACACGGTTATAGA 

CACGGGGTTGAAATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCA 

TCGTAACAGGCCTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCG 

TAGGGGCAGAGTAGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAA 

ACAGCAACAGGGTCAAAGGACTACCACTATGACCTCTTGCAGGCACAAAGATA 

CGGGATTGAGGATGGAATCAACGTGACGAAATCCTTTAGGGAGATGAATTACG 

ATTGGAGCCTATACGAGGAGGACAGCCTACTAATAACCCAGCTGGAAATACTA 

AATAATCTACTCATCTCAGAAGACTTGCCAGCCGCTGTTAAGAACATAATGGCC 

AGGACTGATCACCCAGAGCCAATCCAACTTGCATACAACAGCTATGAAGTCCA 

GGTCCCGGTCCTGTTCCCAAAAATAAGGAATGGAGAAGTGACAGACACCTACG 

AAAATTACTCGTTTCTAAATGCCAGAAAGTTAGGGGAGGATGTGCCCGTGTATA 

TCTACGCTACTGAAGATGAGGATCTGGCAGTTGACCTCTTAGGGCTAGACTGG 

CCTGATCCTGGGAACCAGCAGGTAGTGGAGACTGGTAAAGCACTGAAGCAAGT 

GACCGGGTTGTCCTCGGCTGAAAATGCCCTACTAGTGGCTTTATTTGGGTATGT

GGGTTACCAGGCTCTCTCAAAGAGGCATGTCCCAATGATAACAGACATATATAC 

CATCGAGGACCAGAGACTAGAAGACACCACCCACCTCCAGTATGCACCCAACG 

CCATAAAAACCGATGGGACAGAGACTGAACTGAAAGAACTGGCGTCGGGTGA 

CGTGGAAAAAATCATGGGAGCCATTTCAGATTATG CAGC TGGGGGACTGGAGT 

TTGTTAAATCCCAAGCAGAAAAGATAAAAACAGCTCCTTTGTTTAAAGAAAACG

CAGAAGCCGCAAAAGGGTATGTCCAAAAATTCATTGACTCATTAATTGAAAATA

AAGAAGAAATAATCAGATATGGTTTGTGGGGAACACACACAGCACTATACAAA 

AGCATACKTGCAAGACTCKjGGCATGAAACAGCGTTTGCCACACTAGTGTTAAA

GTGGCTAGCTTTTGGAGGGGAATCAGTGTCAGACCACGTCAAGCAGGCGGCA

GTTGATTTAGTGGTCTATTATGTGATGAATAAGCCTTCCTTCCCAGGTGACTCC

GAGACACAGCAAGAAGGGAGGCGATTCGTCGCAAGCCTGTTCATCTCCGCACT

GGCAACCTACACATACAAAACTTGGAATTACCACAATCTCTCTAAAGTGGTGGA

ACCAGCCCTGGCTTACCTCCCCTATGCTACCAGCGCATTAAAAATGTTCACCCC

AACGCGGCTGGAGAGCGTGGTGATACTGAGCACCACGATATATAAAACATACC

TCTCTATAAGGAAGGGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGC 

AGCCATGGAAATCCTGTCACAAAACCCAGTATCGGTAGGTATATCTGTGATGTT 

GGGGGTAGGGGCAATCGCTGCGCACAACGCTATTGAGTCCAGTGAACAGAAA 

AGGACCCTACTTATGAAGGTGTTTGTAAAGAACTTCTTGGATCAGGCTGCAACA 

GATGAGCTGGTAAAAGAAAACCCAGAAAAAATTATAATGGCCTTATTTGAAGCA 

GTCCAGACAATTGGTAACCCCCTGAGACTAATATACCACCTGTATGGGGTTTAC 

TACAAAGGTTGGGAGGCCAAGGAACTATCTGAGAGGACAGCAGGCAGAAACT 

TATTCACATTGATAATGTTTGAAGCCTTCGAGTTATTAGGGATGGACTCACAAG 

GGAAAATAAGGAACCTGTCCGGAAATTACATTTTGGATTTGATATACGGCCTAC 

ACAAGCAAATCAACAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGC 

ACCCTTTAGTTGTGACTGGACCCCTAGTGACGAGAGGATCAGATTGCCAACAG 

ACAACTATTTGAGGGTAGAAACCAGGTGCCCATGTGGCTATGAGAT GAAAG CT 

TTCAAAAATGTAGGTGGCAAACTTACCAAAGTGGAGGAGAGCGGGCCTTTCCT 

ATGTAGAAACAGACCTGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATT 

ACGATGACAACCTCAGAGAGATAAAACCAGTAGCAAAGTTGGAAGGACAGGTA 

GAGCACTACTACAAAGGGGTCACAGCAAAAATTGACTACAGTAAAGGAAAAAT 

GCTCTTGGCCACTGACAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAG 

CTAAGAGATATACTGGGGTCGGGTTCAATGGTGCATACTTAGGTGACGAGCCC 

AATCACCGTGCTCTAGTGGAGAGGGACTGTGCAACTATAACCAAAAACACAGT 

ACAGTTTCTAAAAATGAAGAAGGGGTGTGCGTTCACCTATGACCTGACCATCTC 

CAATCTGACCAGGCTCATCGAACTAGTACACAGGAACAATCTTGAAGAGAAGG 
GTCCTTCCTCGTGTTCTTCTGCTTTGCGTGGTATCTGAAGGGTAGGTGGGTGCC

CGGAGCGGTCTACGCCTTCTACGGGAAGTGGGTCTTACTCTTATACCACATCTT

AGTGGTACACCCAATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGT 

GGTAAAGGCCGATTCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTT

TTACAACAGTAGTACTAATCGTCATAGGTTTAATCATAGCTAGGCGTGACCCAA 

CTATAGTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACC 

CACCAGCCTGGAGTTGACAT CGCTG TGGCGGTCATGACTATAACCCTACTGAT 

GGTTAGCTATGTGACAGATTATTTTAGATATAAAAAATGGTTACAGTGCATTCTC 

AGCCTGGTATCTGCGGTGTTCTTGATAAGAAGCCTAATAT ACCTA GGTAGAATC 

GAGATGCCAGAGGTAACTATCCCAAACTGGAGACCACTAACTTTAATACTATTA

TATTTGATCTCAACAACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCTA 

TTGTTGCAATGTGTGCCTATCTTATTGCTGGTCACAACCTTGTGGGCCGACTTCT

TAACCCTAATACTGATCCTGCCTACCTATGAATTGGTTAAATTATACTATCTGAA 

AACTGTTAGGACTGATACAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAA 

GAGTTGACTCCATCTACGACGTTGA TGAGAG TGGAGAGGGCGTATATCTTTTTC

CATCAAGGCAGAAAGCACAGGGGAATTTTTCTATACTCTTGCCCCTTATCAAAG 

CAACACTGATAAGTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTTACT 

TAACTTTGGACTTTATGTACTACATGCACAGGAAAGTTATAGAAGAGATCTCAG

GAGGTACCAACATAATATCCAGGTTAGTGGCAGCACTCATAGAGCTGAACTGG 

TCCATGGAAGAAGAGGAGAGCAAAGGCTTAAAGAAGTTTTATCTATTGTCTGG 

AAGGTTGAGAAACCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCTT 

CTTGGTACGGGGAGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATC 

AAGGCCAGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGA 

GGGCCGAGAGTGGAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAG 

CCGATAACGTGTGGGATGTCGCTAGCAGATTTTGAAGAAAGACACTATAAAAG 

AATCTTTATAAGGGAAGGCAACTTTGAGGGTATGTGCAGCCGATGCCAGGGAA 

AGCATAGGAGGTTTGAAATGGACCGGGAACCTAAGAGTGCCAGATACTGTGCT 

GAGTGTAATAGGCTGCATCCTGCTGAGGAAGGTGACTTTTGGGCAGAGTCGAG 

CATGTTGGGCCTCAAAATCACCTACTTTGCGCTGATGGATGGAAAGGTGTATGA 

TATCACAGAGTGGGCTGGATGCCAGCGTGTGGGAA TCTC CCCAGATACCCACA 

GAGTCCCTTGTCACATCTCATTTGGTTCACGGATGCCTTTCAGGCAGGAATACA 

ATGGCTTTGTACAATATACCGCTAGGGGGCAACTATTTCTGAGAAACTTGCCCG

TACTGGCAACTAAAGTAAAAATGCTCATGGTAGGCAACCTTGGAGAAGAAATT

GGTAATCTGGAACATCTTGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAA 

GATCACAGAGCACGAAAAATGCCACATTAATATACTGGATAAACTAACCGCATT 

TTTCGGGATCATGCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTA

CGAGCTTACTAAAAGTGAGGAGGGGTCTGGAGACTGCCTGGGCTTACACACAC 

CAAGGCGGGATAAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGT 

CTGTGACAGCATGGGACGAACTAGAGTGGTTTGCCAAAGCAACAACAGGTTGA 

CCGATGAGACAGAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGC 

cagatgttatgtgttaaatccagaggccgttaacatatcaggatccaaagggg 

cagtcgttcacctccaaaagacaggtggagaattcacgtgtgtcaccgcatca 

ggcacaccggctttcttcgacctaaaaaacttgaaaggatggtcaggcttgcct

atatttgaagcctccagcgggagggtggttggcagagtcaaagtagggaaga 

atgaagagtctaaacctacaaaaataatgagtggaatccagaccgtctcaaaaa 

acagagcagacctgaccgagatggtcaagaagataaccagcatgaacagggg 

agacttcaagcagattactttggcaacaggggcaggcaaaaccacagaactcc 

CAAAAGCAGTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATA 

CCATTAAGGGCAGCGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCC 

AAGCATCTCTTTTAACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAA 

CCGGGATAACCTATGCATCATACGGGTACTTCTGCCAAATGCCTCAACCAAAGC

TCAGAGCTGCTATGGTAGAATACTCATACATATTCTTAGA TGAAT ACCATTGTGC 

CACTCCTGAACAACTGGCAATTATCGGGAAGATCCACAGATTTTCAGAGAGTAT 

AAGGGTTGTCGCCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGT 
Gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcctaggtacatggcacgtgccagccccct

gatgggggcgacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagccatggcgttagtatgag 

tgtcgtgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaaccggtgagtacaccggaattgccaggacgac 

cgggtcctttcttggataaacccgctcaatgcctggagatttgggcgtgcccccgcaagactgctagccgagtagtgttgggtcgcgaa 

aggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATGAGCACGAATC 

CTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGTCGCCCACAGGACGTC 

AAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAG 

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAGCGGTCGCAA 

CCTCGAGGTAGACGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGA 

CCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGTTGCGGG 

TGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGGGCCCCAC 

AGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTACGT 

GCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTTGGA 

GGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGACGGCGTGA 

ACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTCTCTATCTTCCTTCTGGCCCT

GCTCTCTTGCCTGACCGTGCCCGCTTCAGCCTACCAAGTGCGCAATTCCTCGGG

GCTTTACCATGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACGAGGCGGC 

CGATGCCATCCTGCACACTCCGGGGTGTGTCCCTTGCGTTCGCGAGGGTAACG 

CCTCGAGGTGTTGGGTGGCGGTGACCCCCACGGTGGCCACCAGGGACGGCAA 

ACTCCCCACAACGCAGCTTCGACGTCATATCGATCTGCTTGTC GGGAG CGCCA 

CCCTCTGCTCGGCCCTCTACGTGGGGGACCTGTGCGGGTCTGTCTTTCTTGTTG 

GTCAACTGTTTACCTTCTCTCCCAGGCGCCACTGGACGACGCAAGACTGCAATT

GTTCTATCTATCCCGGCCATATAACGGGTCATCGCATGGCATGGGATATGATGA 

TGAACTGGTCCCCTACGGCAGCGTTGGTGGTAGCTCAGCTGCTCCGGATCCCA 

CAAGCCATCATGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCAT 

AGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGC 

TATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCG 

CACCACGGCTGGGCTTGTTGGTCTCCTTACACCAGGCGCCAAGCAGAACATCC 

AACTGATCAACACCAACGGCAGTTGGCACATCAATAGCACGGCCTTGAACTGC 

AATGAAAGCCTTAACACCGGCTGGTTAGCAGGGCTCTTCTATCAGCACAAATTC

AACTCTTCAGGCTGTCCTGAGAGGTTGGCCAGCTGCCGACGCCTTACCGATTTT

GCCCAGGGCTGGGGTCCTATCAGTTATGCCAACGGAAGCGGCCTCGACGAAC 

GCCCCTACTGCTGGCACTACCCTCCAAGACCTTGTGGCATTGTGCCCGCAAAG 

AGCGTGTGTGGCCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGGAAC 

GACCGACAGGTCGGGCGCGCCTACCTACAGCTGGGGTGCAAATGATACGGAT 

GTCTTCGTCCTTAACAACACCAGGCCACCGCTGGGCAATTGGTTCGGTTGTACC 

TGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCCCCTTGTGTCAT 

CGGAGGGGTGGGCAACAACACCTTGCTCTGCCCCACTGATTGTTTCCGCAAGC

ATCCGGAAGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATTACACCCAGG 

TGCATGGTCGACTACCCGTATAGGCTTTGGCACTATCCTTGTACCATCAATTAC 

ACCATATTCAAAGTCAGGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAG 

CGGCCTGCAACTGGACGCGGGGCGAACGCTGTGATCTGGAAGACAGGGACAG 

GTCCGAGCTCAGCCCATTGCTGCTGTCCACCACACAGTGGCAGGTCCTTCCGT 

GTTCTTTCACGACCCTGCCAGCCTTGTCCACCGGCCTCATCCACCTCCACCAGA 

ACATTGTGGACGTGCAGTACTTGTACGGGGTAGGGTCAAGCATCGCGTCCTGG 

GCCATTAAGTGGGAGTACGTCGTTCTCCTGTTCCTCCTGCTTGCAGACGCGCGC 

GTCTGCTCCTGCTTGTGGATGATGTTACTCATATCCCAAGCGGAGGCGGCTTTG

GAGAACCTCGTAATACTCAATGCAGCATCCCTGGCCGGGACGCACGGTCTTGT 
AAGGTTGAAAATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAgacaaaatgtat
atattgtaaataaattaatccatgtacatagtgtatataaatatagttgggaccgtccacctcaagaagacgacacgcccaacacgcacag 
ctaaacagtagtcaagattatctacctcaagataacactacatttaatgcacacagcactttagctgtatgaggatacgcccgacgtctatag 
ttggactagggaagacctctaacagccccc 
ACGTAGGGACTATAAAACCAGTACTAGGAGAGAGAGTAATCCCCGACCCTGTA

GTTGATATCAATTTACAACCAGAGGTGCAAGTGGACACGTCAGAGGTTGGGAT 

CACAATAATTGGAAGGGAAACCCTGATGACAACGGGAGTGACACCTGTCTTGG 

AAAAAGTAGAGCCTGACGCCAGCGACAACCAAAACTCGGTGAAGATCGGGTTG 

GATGAGGGTAATTACCCAGGGCCTGGAATACAGACACATACACTAACAGAAGA 

AATACACAACAGGGATGCGAGGCCCTTCATCATGATCCTGGGCTCAAGGAATT 

CCATATCAAATAGGGCAAAGACTGCTAGAAATATAAATCTGTACACAGGAAATG 

ACCCCAGGGAAATACGAGACTTGATGGCTGCAGGGCGCATGTTAGTAGTAGCA 

CTGAGGGATGTCGACCCTGAGCTGTCTGAAATGGTCGATTTCAAGGGGACTTT

TTTAGATAGGGAGGCCCTGGAGGCTCTAAGTCTCGGGCAACCTAAACCGAAGC 

AGGTTACCAAGGAAGCTGTTAGGAATTTGATAGAACAGAAAAAAGATGTGGAG 

ATCCCTAACTGGTTTGCATCAGATGACCCAGTATTTCTGGAAGTGGCCTTAAAA 

AATGATAAGTACTACTTAGTAGGAGATGTTGGAGAGCTAAAAGATCAAGCTAAA 

GCACTTGGGGCCACGGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGGA 

CGTATGCCATGAAGCTATCTAGCTGGTTCCTCAAGGCATCAAACAAACAGATGA 

GTTTAACTCCACTGTTTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAGA 

GCAATAAGGGGCACATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAG 

CCCCTCGGTTGCGGGGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAGAT 

ACACCCATATGAAGCTTACCTGAAGTTGAAAGATTTCATAGAAGAAGAAGAGAA 

GAAACCTAGGGTTAAGGATAGAGTAATAAGAGAGCACAACAAATGGATACTTA 

AAAAAATAAGGTTTCAAGGAAACCTCAACACCAAGAAAATGCTCAACCCAGGG 

AAACTATCTGAACAGTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAACCA 

CCAGATTGGTACTATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCCAA 

TAGTGAGGGCCCAAACCGACACCAAAACCTTTCATGAGGCAATAAGAGATAAG 

ATAGACAAGAGTGAAAACCGGCAAAATCCAGAATTGCACAACAAATTGTTGGA 

GATTTTCCACACGATAGCCCAACCCACCCTGAAACACACCTACGGTGAGGTGA 

CGTGGGAGCAACTTGAGGCGGGGGTAAATAGAAAGGGGGCAGCAGGCTTCCT 

GGAGAAGAAGAACATCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGAAC 

AATTGGTCAGGGATCTGAAGGCCGGGAGAAAGATAAAATATTATGAAACTGCA 

ATACCAAAAAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGACC 

TGGTGGTTGAGAAGAGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAAGG 

CTAGCCATCACTAAGGTCATGTATAACTGGGTGAAACAGCAGCCCGTTGTGATT 

CCAGGATATGAAGGAAAGACCCCCTTGTTCAACATCTTTGATAAAGTGAGAAAG

GAATGGGACTCGTTCAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGCCTG 

GGACACTCAAGTGACTAGTAAGGATCTGCAACTTATTGGAGAAATCCAGAAATA 

TTACTATAAGAAGGAGTGGCACAAGTTCATTGACACCATCACCGACCACATGAC 

AGAAGTACCAGTTATAACAGCAGATGGTGAAGTATATATAAGAAATGGGCAGA 

GAGGGAGCGGCCAGCCAGACACAAGTGCTGGCAACAGCATGTTAAATGTCCT 

GACAATGATGTACGGCTTCTGCGAAAGCACAGGGGTACCGTACAAGAGTTTCA 

ACAGGGTGGCAAGGATCCACGTCTGTGGGGATGATGGCTTCTTAATAACTGAA 

AAAGGGTTAGGGCTGAAATTTGCTAACAAAGGGATGCAGATTCTTCATGAAGC

AGGCAAACCTCAGAAGATAACGGAAGGGGAAAAGATGAAAGTTGCCTATAGAT

TTGAGGATATAGAGTTCTGTTCTCATACCCCAGTCCCTGTTAGGTGGTCCGACA 

ACACCAGTAGTCACATGGCCGGGAGAGACACCGCTGTGATACTATCAAAGATG 

GCAACAAGATTGGATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAAAAGC 

GGTAGCCTTCAGTTTCTTGCTGATGTATTCCTGGAACCCGCTTGTTAGGAGGAT

TTGCCTGTTGGTCCTTTCGCAACAGCCAGAGACAGACCCATCAAAACATGCCAC 

TTATTATTACAAAGGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTGGGAA 

TCTAAGTGAACTGAAGAGAACAGGCTTTGAGAAATTGGCAAATCTAAACCTAAG 

CCTGTCCACGTTGGGGGTCTGGACTAAGCACACAAGCAAAAGAATAATTCAGG 

ACTGTGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTAAGCCCGACAGG 

CTGATATCCAGCAAAACTGGCCACTTATACATACCTGATAAAGGCTTTACATTAC

AAGGAAAGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCCGGTCATG 

GGGGTTGGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCTGCTGAG 



